brightwind.load.load.load_excel

brightwind.load.load.load_excel(filepath_or_folder, search_by_file_type=['.xlsx'], print_progress=True, sheet_name=0, **kwargs)

Load timeseries data from an Excel file, or a group of files in a folder, into a DataFrame. The Excel file is expected to have column headings in the first row and the timestamp column as the first column. These defaults can be overridden by passing your own arguments, as this function is a wrapper around pandas.read_excel. The pandas.read_excel documentation can be found at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html

Parameters
  • filepath_or_folder (str) – Location of the file, or of the folder, containing the timeseries data.

  • search_by_file_type (List[str], default ['.xlsx']) – List of file extensions to search for, e.g. ['.xlsx'], if a folder is sent.

  • print_progress (bool, default True) – If True, print a statement for each file as it is being processed.

  • sheet_name (str, int, mixed list of str/int, or None, default 0) – The name, or zero-based position, of the Excel sheet you want to read from.

  • kwargs – All the kwargs from pandas.read_excel can be passed to this function.

Returns

A DataFrame with timestamps as its index.

Return type

pandas.DataFrame

When assembling files from a folder into a single DataFrame with the timestamp as the index, the function automatically checks for duplicates and raises an error if any are found.
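The duplicate check can be illustrated with plain pandas. This is a minimal sketch and not brightwind's own implementation. It assumes that "duplicates" means repeated timestamps in the index. The helper name `assemble_frames`, the choice of `ValueError`, and the column name and values are all made up for illustration.

```python
import pandas as pd


def assemble_frames(frames):
    """Concatenate timestamp-indexed DataFrames and reject repeated timestamps.

    Hypothetical helper for illustration only; not part of brightwind.
    """
    combined = pd.concat(frames).sort_index()
    duplicated = combined.index[combined.index.duplicated()]
    if len(duplicated) > 0:
        # Assumption: the error type used by the real function may differ.
        raise ValueError(f"Duplicate timestamps found: {list(duplicated)}")
    return combined


# Made-up sample data standing in for the contents of three files.
first = pd.DataFrame({"Spd80mN": [5.1, 5.4]},
                     index=pd.to_datetime(["2020-01-01 00:00", "2020-01-01 00:10"]))
second = pd.DataFrame({"Spd80mN": [5.9]},
                      index=pd.to_datetime(["2020-01-01 00:20"]))
overlapping = pd.DataFrame({"Spd80mN": [6.2]},
                           index=pd.to_datetime(["2020-01-01 00:10"]))

df = assemble_frames([first, second])        # all timestamps are unique
try:
    assemble_frames([first, overlapping])    # 00:10 appears in both frames
except ValueError as err:
    print(err)
```

If an error like this is raised when loading a folder, a likely cause is two files covering overlapping periods.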

Example usage

import brightwind as bw
filepath = r'C:\some\folder\some_data.xlsx'
df = bw.load_excel(filepath)
print(df)

To load a group of Excel files from a folder:

folder = r'C:\some\folder\with\excel\files'
df = bw.load_excel(folder, print_progress=True)
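To give a feel for what searching a folder by file type involves, here is a small standard-library sketch. It is not brightwind's code: `find_files` is a hypothetical helper, and both the recursive search into subfolders and the case-insensitive extension match are assumptions that this page does not confirm.

```python
import tempfile
from pathlib import Path


def find_files(folder, search_by_file_type=(".xlsx",)):
    """Return sorted file paths under `folder` with a matching extension.

    Hypothetical helper for illustration only; not part of brightwind.
    """
    wanted = {ext.lower() for ext in search_by_file_type}
    # Assumption: subfolders are searched too (rglob); the real behaviour may differ.
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in wanted
    )


# Demonstrate on a throwaway folder containing two Excel files and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("jan.xlsx", "feb.xlsx", "notes.txt"):
        (Path(tmp) / name).touch()
    matches = [p.name for p in find_files(tmp)]
    print(matches)
```

Only the files whose extension appears in `search_by_file_type` are picked up, so unrelated files in the same folder are ignored.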

If the file differs from the standard layout, for example because the column headings are not in the first row, the pandas.read_excel keyword arguments (kwargs) can be used:

filepath = r'C:\some\folder\some_data_with_column_headings_on_second_line.xlsx'
df = bw.load_excel(filepath, skiprows=1)  # skip the first row so the headings on the second row are used
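Because extra keyword arguments are forwarded to pandas.read_excel, anything the caller passes takes precedence over the wrapper's defaults. The sketch below shows one common way a wrapper can implement that pattern. The helper `build_read_excel_kwargs` is hypothetical and the default values (`header=0`, `index_col=0`, `parse_dates=True`) are assumptions inferred from the description above, not taken from brightwind's source.

```python
def build_read_excel_kwargs(**kwargs):
    """Merge caller-supplied kwargs over a set of assumed defaults.

    Hypothetical helper for illustration only; not part of brightwind.
    """
    # Assumed defaults: headings in the first row, timestamps in the first column.
    defaults = {"header": 0, "index_col": 0, "parse_dates": True}
    # Later entries win, so anything the caller passes overrides a default.
    return {**defaults, **kwargs}


standard = build_read_excel_kwargs()
custom = build_read_excel_kwargs(skiprows=1, index_col=1)
print(standard)
print(custom)
```

In this pattern the caller only needs to name the arguments that differ from the standard layout.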