hdx.scraper.outputs.googlesheets

GoogleSheets Objects

class GoogleSheets(BaseOutput)

GoogleSheets class enabling writing to Google spreadsheets.

Arguments:

  • configuration Dict - Configuration for Google Sheets
  • gsheet_auth str - Authorisation for Google Sheets/Drive
  • updatesheets List[str] - List of spreadsheets to update (e.g. prod, test)
  • tabs Dict[str, str] - Dictionary of mappings from internal name to spreadsheet tab name
  • updatetabs List[str] - Tabs to update
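
The relationship between `tabs` and `updatetabs` can be sketched as follows. This is an illustration only: `resolve_tab` is a hypothetical helper, not part of the library, and it assumes that a tab is written only when its internal name appears in `updatetabs`.

```python
from typing import Dict, List, Optional


def resolve_tab(
    tabname: str, tabs: Dict[str, str], updatetabs: List[str]
) -> Optional[str]:
    """Return the spreadsheet tab name for an internal name.

    Hypothetical helper: returns None when the tab should be skipped.
    """
    # Assumption: tabs not listed in updatetabs are left untouched.
    if tabname not in updatetabs:
        return None
    # Map the internal name to the name shown in the spreadsheet.
    return tabs.get(tabname)


# Example values invented for illustration.
tabs = {"national": "National Data", "subnational": "Subnational Data"}
updatetabs = ["national"]

resolved = [resolve_tab(name, tabs, updatetabs) for name in ("national", "subnational")]
print(resolved)
```

Here only `national` resolves to a spreadsheet tab name, because `subnational` is not listed in `updatetabs`.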

update_tab

def update_tab(tabname: str,
               values: Union[List, DataFrame],
               hxltags: Optional[Dict] = None,
               limit: Optional[int] = None) -> None

Update tab with values

Arguments:

  • tabname str - Tab to update
  • values Union[List, DataFrame] - Values in a list of lists or a DataFrame
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.
  • limit Optional[int] - Maximum number of rows to output. Defaults to None.

Returns:

None

hdx.scraper.outputs.base

BaseOutput Objects

class BaseOutput()

Base class for outputs. Because its methods do nothing, it can also be used as a stand-in output during testing.

Arguments:

  • updatetabs List[str] - Tabs to update

update_tab

def update_tab(tabname: str,
               values: Union[List, DataFrame],
               hxltags: Optional[Dict] = None,
               **kwargs: Any) -> None

Update tab with values. Classes that inherit from this one should implement this method.

Arguments:

  • tabname str - Tab to update
  • values Union[List, DataFrame] - Values in a list of lists or a DataFrame
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.
  • **kwargs Any - Keyword arguments

Returns:

None
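
A minimal sketch of how a subclass might implement `update_tab`. To stay self-contained, the `BaseOutput` below is a local stand-in that mirrors only the documented signature; it is not the library class. `RecordingOutput` and its `written` attribute are hypothetical, and skipping tabs that are not in `updatetabs` is an assumption.

```python
from typing import Any, Dict, List, Optional


class BaseOutput:
    """Local stand-in mirroring the documented interface (not the library class)."""

    def __init__(self, updatetabs: List[str]) -> None:
        self.updatetabs = updatetabs

    def update_tab(
        self,
        tabname: str,
        values: Any,
        hxltags: Optional[Dict] = None,
        **kwargs: Any,
    ) -> None:
        # The base implementation does nothing, as documented.
        return None


class RecordingOutput(BaseOutput):
    """Hypothetical subclass that keeps written values in memory."""

    def __init__(self, updatetabs: List[str]) -> None:
        super().__init__(updatetabs)
        self.written: Dict[str, Any] = {}

    def update_tab(
        self,
        tabname: str,
        values: Any,
        hxltags: Optional[Dict] = None,
        **kwargs: Any,
    ) -> None:
        # Assumption: tabs that are not in updatetabs are skipped.
        if tabname not in self.updatetabs:
            return
        self.written[tabname] = values


output = RecordingOutput(["national"])
output.update_tab("national", [["Country", "Value"], ["AFG", 1]])
output.update_tab("regional", [["Region", "Value"]])  # skipped
print(output.written)
```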

add_data_row

def add_data_row(key: str, row: Dict) -> None

Add a row under a key

Arguments:

  • key str - Key to update
  • row Dict - Row to add

Returns:

None

add_dataframe_rows

def add_dataframe_rows(key: str,
                       df: DataFrame,
                       hxltags: Optional[Dict] = None) -> None

Add rows from dataframe under a key

Arguments:

  • key str - Key to update
  • df DataFrame - Dataframe containing rows
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.

Returns:

None

add_data_rows_by_key

def add_data_rows_by_key(name: str,
                         countryiso: str,
                         rows: List[Dict],
                         hxltags: Optional[Dict] = None) -> None

Add rows under both a key and an ISO 3 country code subkey

Arguments:

  • name str - Key to update
  • countryiso str - Country to use as subkey
  • rows List[Dict] - List of dictionaries
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.

Returns:

None

add_additional

def add_additional() -> None

Download files and add them under keys defined in the configuration

Returns:

None

save

def save(**kwargs: Any) -> None

Save file

Arguments:

  • **kwargs Any - Variables to use when evaluating template arguments

Returns:

None

hdx.scraper.outputs.json

JsonFile Objects

class JsonFile(BaseOutput)

JsonFile class enabling writing to JSON files.

Arguments:

  • configuration Dict - Configuration for JSON output
  • updatetabs List[str] - Tabs to update
  • suffix str - A suffix to add to keys. Defaults to _data.

add_data_row

def add_data_row(key: str, row: Dict) -> None

Add row to JSON under a key

Arguments:

  • key str - Key in JSON to update
  • row Dict - Row to add

Returns:

None

add_dataframe_rows

def add_dataframe_rows(key: str,
                       df: DataFrame,
                       hxltags: Optional[Dict] = None) -> None

Add rows from dataframe under a key

Arguments:

  • key str - Key in JSON to update
  • df DataFrame - Dataframe containing rows
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.

Returns:

None
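
The effect of an HXL tag mapping on rows can be sketched as below. This is a simplified illustration, not the library's implementation: plain dictionaries stand in for DataFrame rows to avoid a pandas dependency, `apply_mapping` is a hypothetical helper, and the sketch assumes that `hxltags` maps each source column to the name used in the output and that unmapped columns are dropped.

```python
from typing import Dict, List, Optional


def apply_mapping(rows: List[Dict], hxltags: Optional[Dict] = None) -> List[Dict]:
    """Rename the keys of each row according to a mapping (hypothetical helper)."""
    if hxltags is None:
        # Without a mapping, rows are copied unchanged.
        return [dict(row) for row in rows]
    # Assumption: only mapped columns are kept, renamed to their mapped value.
    return [
        {output_name: row.get(column) for column, output_name in hxltags.items()}
        for row in rows
    ]


# Example data invented for illustration.
rows = [{"iso3": "AFG", "Population": 100, "Notes": "provisional"}]
hxltags = {"iso3": "#country+code", "Population": "#population"}

mapped = apply_mapping(rows, hxltags)
print(mapped)
```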

add_data_rows_by_key

def add_data_rows_by_key(key: str,
                         countryiso: str,
                         rows: List[Dict],
                         hxltags: Optional[Dict] = None) -> None

Add rows under both a key and an ISO 3 country code subkey

Arguments:

  • key str - Key in JSON to update
  • countryiso str - Country to use as subkey
  • rows List[Dict] - List of dictionaries
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.

Returns:

None
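
One plausible shape for the resulting JSON, with rows stored under a key and then under a country subkey, is sketched below. `add_rows_by_key` and `store` are hypothetical names, and appending the suffix to the key is an assumption based on the `suffix` argument of the class.

```python
import json
from typing import Dict, List


def add_rows_by_key(
    store: Dict,
    key: str,
    countryiso: str,
    rows: List[Dict],
    suffix: str = "_data",
) -> None:
    """Store rows under key + suffix, then under the ISO 3 country code."""
    fullname = f"{key}{suffix}"
    # Create the top-level entry on first use, then set the country subkey.
    store.setdefault(fullname, {})[countryiso] = list(rows)


# Example data invented for illustration.
store: Dict = {}
add_rows_by_key(store, "subnational", "AFG", [{"value": 1}])
add_rows_by_key(store, "subnational", "KEN", [{"value": 2}, {"value": 3}])

print(json.dumps(store, indent=2))
```

Both countries end up under the single top-level key `subnational_data`, each with its own list of rows.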

generate_json_from_list

def generate_json_from_list(key: str, rows: List[Dict]) -> None

Generate JSON from key and rows list

Arguments:

  • key str - Key in JSON to update
  • rows List[Dict] - List of dictionaries

Returns:

None

generate_json_from_df

def generate_json_from_df(key: str, df: DataFrame,
                          hxltags: Optional[Dict]) -> None

Generate JSON from key and dataframe

Arguments:

  • key str - Key in JSON to update
  • df DataFrame - Dataframe containing rows
  • hxltags Optional[Dict] - HXL tag mapping

Returns:

None

update_tab

def update_tab(tabname: str,
               values: Union[List, DataFrame],
               hxltags: Optional[Dict] = None) -> None

Update tab with values

Arguments:

  • tabname str - Tab to update
  • values Union[List, DataFrame] - Values in a list of lists or a DataFrame
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.

Returns:

None

add_additional

def add_additional() -> None

Download JSON files and add them under keys defined in the configuration

Returns:

None

save

def save(folder: Optional[str] = None, **kwargs: Any) -> List[str]

Save JSON file and any additional subsets of that JSON defined in the additional configuration

Arguments:

  • folder Optional[str] - Folder to save to. Defaults to None.
  • **kwargs Any - Variables to use when evaluating template arguments

Returns:

  • List[str] - List of file paths
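
The idea of saving a main file plus subsets and returning their paths can be sketched as follows. Everything here is hypothetical: `save_with_subsets`, the file name `all.json`, and the `subsets` argument (file name mapped to a list of keys) are invented for illustration and do not reflect the library's actual configuration format.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def save_with_subsets(
    data: Dict, folder: str, subsets: Dict[str, List[str]]
) -> List[str]:
    """Write the full JSON plus one file per subset; return all file paths."""
    paths: List[str] = []

    # Write the complete JSON first.
    main_path = Path(folder) / "all.json"
    main_path.write_text(json.dumps(data))
    paths.append(str(main_path))

    # Write each subset, keeping only the requested top-level keys.
    for filename, keys in subsets.items():
        subset = {key: data[key] for key in keys if key in data}
        subset_path = Path(folder) / filename
        subset_path.write_text(json.dumps(subset))
        paths.append(str(subset_path))
    return paths


# Example data invented for illustration.
data = {"national_data": [{"value": 1}], "subnational_data": [{"value": 2}]}

with tempfile.TemporaryDirectory() as folder:
    paths = save_with_subsets(data, folder, {"national.json": ["national_data"]})
    saved_subset = json.loads(Path(paths[1]).read_text())

print(len(paths), saved_subset)
```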

hdx.scraper.outputs.excelfile

ExcelFile Objects

class ExcelFile(BaseOutput)

ExcelFile class enabling writing to Excel spreadsheets.

Arguments:

  • excel_path str - Path to output spreadsheet
  • tabs Dict[str, str] - Dictionary of mappings from internal name to spreadsheet tab name
  • updatetabs List[str] - Tabs to update

update_tab

def update_tab(tabname: str,
               values: Union[List, DataFrame],
               hxltags: Optional[Dict] = None) -> None

Update tab with values

Arguments:

  • tabname str - Tab to update
  • values Union[List, DataFrame] - Values in a list of lists or a DataFrame
  • hxltags Optional[Dict] - HXL tag mapping. Defaults to None.

Returns:

None

save

def save() -> None

Save spreadsheet

Returns:

None