brightwind.load.load.apply_cleaning

brightwind.load.load.apply_cleaning(data, cleaning_file_or_df, inplace=False, sensor_col_name='Sensor', date_from_col_name='Start', date_to_col_name='Stop', all_sensors_descriptor='All', replacement_text='NaN')

Apply cleaning to a DataFrame using predetermined flagged periods for each sensor listed in a cleaning file. The flagged data is replaced with NaN values, which then do not appear in any plots or affect calculations.

Parameters

data (pandas.DataFrame) – Data to be cleaned.
cleaning_file_or_df (str or pandas.DataFrame) – File path of the CSV file, or a pandas DataFrame, containing the list of sensor names along with the start and end timestamps of the flagged periods.
inplace (bool, default False) – If True, the original data, 'data', is modified and replaced with the cleaned data. If False, the original data is left untouched and a new object containing the cleaned data is returned; assign it to a new variable to keep it.
sensor_col_name (str, default 'Sensor') – The column name which contains the list of sensor names that have flagged periods.
date_from_col_name (str, default 'Start') – The column name of the date_from or the start date of the period to be cleaned.
date_to_col_name (str, default 'Stop') – The column name of the date_to or the end date of the period to be cleaned.
all_sensors_descriptor (str, default 'All') – A text descriptor that represents ALL sensors in the DataFrame.
replacement_text (str, default 'NaN') – Text used to replace the flagged data.
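
As an illustration of the cleaning table these parameters describe, the sketch below builds one as a pandas DataFrame using the default column names. The sensor names ('Spd80mN', 'Dir78mS') and the timestamps are invented for the example and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical flagged periods; sensor names and dates are illustrative only.
cleaning_df = pd.DataFrame(
    {
        # 'All' is the default all_sensors_descriptor and stands for every sensor.
        "Sensor": ["Spd80mN", "Dir78mS", "All"],
        "Start": ["2016-01-10 00:00", "2016-02-01 12:00", "2016-03-05 00:00"],
        "Stop": ["2016-01-12 00:00", "2016-02-01 18:00", "2016-03-06 00:00"],
    }
)

# Parse the period bounds into timestamps so they can be checked or sorted.
cleaning_df["Start"] = pd.to_datetime(cleaning_df["Start"])
cleaning_df["Stop"] = pd.to_datetime(cleaning_df["Stop"])

print(cleaning_df)
```

A table like this could be passed as cleaning_file_or_df directly, or written out with cleaning_df.to_csv(path, index=False) and passed as a file path. If the columns were named differently, the sensor_col_name, date_from_col_name and date_to_col_name arguments would need to match.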

Returns

DataFrame with the flagged data replaced by replacement_text (NaN by default).

Return type

pandas.DataFrame

Example usage

import brightwind as bw

Load data::

    data = bw.load_csv(r'C:\Users\Stephen\Documents\Analysis\demo_data')
    cleaning_file = r'C:\Users\Stephen\Documents\Analysis\demo_cleaning_file.csv'

To apply cleaning to 'data' and store the cleaned data in 'data_cleaned'::

    data_cleaned = bw.apply_cleaning(data, cleaning_file)
    print(data_cleaned)

To modify 'data' and replace it with the cleaned data::

    bw.apply_cleaning(data, cleaning_file, inplace=True)
    print(data)

To apply cleaning where the cleaning file has column names other than the defaults::

    cleaning_file = r'C:\some\folder\cleaning_file.csv'
    data = bw.apply_cleaning(data, cleaning_file, sensor_col_name='Data column',
                             date_from_col_name='Start Time', date_to_col_name='Stop Time')
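
The effect of the cleaning step can be sketched in plain pandas. This is not brightwind's implementation: apply_cleaning_sketch is a hypothetical helper, and it assumes that sensor names match column names exactly, that the start of a period is inclusive and the end exclusive, and that flagged values always become NaN. The library's actual matching and boundary rules may differ.

```python
import numpy as np
import pandas as pd


def apply_cleaning_sketch(data, cleaning_df, all_sensors_descriptor="All"):
    """Return a copy of `data` with flagged periods set to NaN (illustrative only)."""
    cleaned = data.copy()
    for _, row in cleaning_df.iterrows():
        start = pd.to_datetime(row["Start"])
        stop = pd.to_datetime(row["Stop"])
        # Assumption: start is inclusive and stop is exclusive.
        in_period = (cleaned.index >= start) & (cleaned.index < stop)
        if row["Sensor"] == all_sensors_descriptor:
            columns = list(cleaned.columns)
        else:
            # Assumption: the sensor name matches a column name exactly.
            columns = [col for col in cleaned.columns if col == row["Sensor"]]
        cleaned.loc[in_period, columns] = np.nan
    return cleaned


# Hypothetical hourly data with two made-up sensor columns.
index = pd.date_range("2020-01-01 00:00", "2020-01-01 05:00", periods=6)
data = pd.DataFrame(
    {
        "Spd80mN": [5.0, 5.5, 6.0, 6.5, 7.0, 7.5],
        "Dir78mS": [180.0, 185.0, 190.0, 195.0, 200.0, 205.0],
    },
    index=index,
)
cleaning_df = pd.DataFrame(
    {
        "Sensor": ["Spd80mN", "All"],
        "Start": ["2020-01-01 01:00", "2020-01-01 04:00"],
        "Stop": ["2020-01-01 03:00", "2020-01-01 05:00"],
    }
)

data_cleaned = apply_cleaning_sketch(data, cleaning_df)
print(data_cleaned)
```

Because the helper works on a copy, the original data is left unchanged, which mirrors the inplace=False behaviour described above.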