How to apply a cleaning file¶
[1]:
# This cell is just used to record the date of this tutorial and is not part of the tutorial.
import datetime
print('Last updated: {}'.format(datetime.date.today().strftime('%d %B, %Y')))
Last updated: 19 July, 2019
Outline:¶
This guide will demonstrate how to use the apply_cleaning()
and apply_cleaning_windographer()
functions to clean data using pre-existing file which denote the cleaning to be applied. The tutorial includes the following steps:
1. ``apply_cleaning()`` with a simple csv:
import the brightwind library and some sample data
plot monthly means of raw wind speed
apply cleaning to data based on a pre-existing .csv cleaning file:
plot monthly means of cleaned wind speed
overview of cleaning file structure
2. ``apply_cleaning_windographer()`` with a windographer file:
import some sample data
plot monthly means of raw wind speed
apply cleaning to data based on a pre-existing windographer cleaning file:
plot monthly means of cleaned wind speed
1. apply_cleaning()
with a simple csv:¶
Import the brightwind library and some sample data:¶
[2]:
import brightwind as bw
[3]:
# specify location of existing sample dataset
data_file_path = r'C:\Users\Stephen\Documents\Analysis\demo_data.csv'
# load data as dataframe
data = bw.load_csv(data_file_path)
# show first few rows of dataframe
data.head(5)
[3]:
Spd80mN | Spd80mS | Spd60mN | Spd60mS | Spd40mN | Spd40mS | Spd80mNStd | Spd80mSStd | Spd60mNStd | Spd60mSStd | ... | Dir78mSStd | Dir58mS | Dir58mSStd | Dir38mS | Dir38mSStd | T2m | RH2m | P2m | PrcpTot | BattMin | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Timestamp | |||||||||||||||||||||
2016-01-09 15:30:00 | 8.370 | 7.911 | 8.160 | 7.849 | 7.857 | 7.626 | 1.240 | 1.075 | 1.060 | 0.947 | ... | 6.100 | 110.1 | 6.009 | 112.2 | 5.724 | 0.711 | 100.0 | 935.0 | 0.0 | 12.94 |
2016-01-09 15:40:00 | 8.250 | 7.961 | 8.100 | 7.884 | 7.952 | 7.840 | 0.897 | 0.875 | 0.900 | 0.855 | ... | 5.114 | 110.9 | 4.702 | 109.8 | 5.628 | 0.630 | 100.0 | 935.0 | 0.0 | 12.95 |
2016-01-09 17:00:00 | 7.652 | 7.545 | 7.671 | 7.551 | 7.531 | 7.457 | 0.756 | 0.703 | 0.797 | 0.749 | ... | 4.172 | 113.1 | 3.447 | 111.8 | 4.016 | 1.126 | 100.0 | 934.0 | 0.0 | 12.75 |
2016-01-09 17:10:00 | 7.382 | 7.325 | 6.818 | 6.689 | 6.252 | 6.174 | 0.844 | 0.810 | 0.897 | 0.875 | ... | 4.680 | 118.8 | 5.107 | 115.6 | 5.189 | 0.954 | 100.0 | 934.0 | 0.0 | 12.71 |
2016-01-09 17:20:00 | 7.977 | 7.791 | 8.110 | 7.915 | 8.140 | 7.974 | 0.556 | 0.528 | 0.562 | 0.524 | ... | 3.123 | 115.9 | 2.960 | 113.6 | 3.540 | 0.863 | 100.0 | 934.0 | 0.0 | 12.69 |
5 rows × 29 columns
Plot monthly means of raw data:¶
[4]:
# create list of columns which include anemometer wind speed data
anemometers = ['Spd80mN', 'Spd80mS', 'Spd60mN', 'Spd60mS', 'Spd40mN', 'Spd40mS']
[5]:
# plot monthly means of wind speed for each anemometer
bw.monthly_means(data[anemometers])
[5]:
Note the spurious ‘Spd80mS’ monthly mean values which can be seen for the last few months.
Apply cleaning to data based on a pre-existing cleaning file:¶
[6]:
# specify location of associated cleaning file
cleaning_file_path = r'C:\Users\Stephen\Documents\Analysis\demo_cleaning_file.csv'
Data can be cleaned either in place or by assigning the cleaned data to a new variable. By deafult, the original data is untouched and a new DataFrame containing the cleaned data is created.
[7]:
# apply cleaning
clean_data = bw.apply_cleaning(data, cleaning_file_path)
To apply cleaning in place, i.e. replace the original data with the cleaned data, set the inplace parameter to True:
[8]:
# apply cleaning in place
bw.apply_cleaning(data, cleaning_file_path, inplace=True)
[8]:
Spd80mN | Spd80mS | Spd60mN | Spd60mS | Spd40mN | Spd40mS | Spd80mNStd | Spd80mSStd | Spd60mNStd | Spd60mSStd | ... | Dir78mSStd | Dir58mS | Dir58mSStd | Dir38mS | Dir38mSStd | T2m | RH2m | P2m | PrcpTot | BattMin | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Timestamp | |||||||||||||||||||||
2016-01-09 15:30:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2016-01-09 15:40:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2016-01-09 17:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2016-01-09 17:10:00 | 7.382 | 7.325 | 6.818 | 6.689 | 6.252 | 6.174 | 0.844 | 0.810 | 0.897 | 0.875 | ... | 4.680 | 118.80 | 5.107 | 115.60 | 5.189 | 0.954 | 100.0 | 934.0 | 0.0 | 12.71 |
2016-01-09 17:20:00 | 7.977 | 7.791 | 8.110 | 7.915 | 8.140 | 7.974 | 0.556 | 0.528 | 0.562 | 0.524 | ... | 3.123 | 115.90 | 2.960 | 113.60 | 3.540 | 0.863 | 100.0 | 934.0 | 0.0 | 12.69 |
2016-01-09 17:30:00 | 8.340 | 8.160 | 8.370 | 8.170 | 8.330 | 8.180 | 0.676 | 0.607 | 0.756 | 0.708 | ... | 3.260 | 117.20 | 3.600 | 117.40 | 4.526 | 0.731 | 100.0 | 934.0 | 0.0 | 12.67 |
2016-01-09 17:40:00 | 8.130 | 7.929 | 8.090 | 7.895 | 7.972 | 7.788 | 0.557 | 0.507 | 0.534 | 0.498 | ... | 3.677 | 115.90 | 3.371 | 115.90 | 3.515 | 0.852 | 100.0 | 933.0 | 0.0 | 12.68 |
2016-01-09 17:50:00 | 7.480 | 7.283 | 7.706 | 7.486 | 7.649 | 7.481 | 0.588 | 0.526 | 0.590 | 0.529 | ... | 3.500 | 119.90 | 3.265 | 118.90 | 3.322 | 0.771 | 100.0 | 933.0 | 0.0 | 12.67 |
2016-01-09 18:00:00 | 7.554 | 7.452 | 7.484 | 7.359 | 7.578 | 7.456 | 0.734 | 0.681 | 0.631 | 0.573 | ... | 3.061 | 113.90 | 2.663 | 111.40 | 2.789 | 0.913 | 100.0 | 933.0 | 0.0 | 12.65 |
2016-01-09 18:10:00 | 8.220 | 8.070 | 8.080 | 7.888 | 7.791 | 7.639 | 0.743 | 0.724 | 0.800 | 0.757 | ... | 3.427 | 115.10 | 4.055 | 113.80 | 4.834 | 0.832 | 100.0 | 933.0 | 0.0 | 12.62 |
2016-01-09 18:20:00 | 9.420 | 9.270 | 9.660 | 9.470 | 9.570 | 9.450 | 0.482 | 0.438 | 0.566 | 0.522 | ... | 2.431 | 119.50 | 2.454 | 117.40 | 3.017 | 0.549 | 100.0 | 933.0 | 0.0 | 12.61 |
2016-01-09 18:30:00 | 9.910 | 9.810 | 9.610 | 9.480 | 8.940 | 8.840 | 0.568 | 0.513 | 0.698 | 0.648 | ... | 3.272 | 114.90 | 3.271 | 113.20 | 4.454 | 0.529 | 100.0 | 933.0 | 0.0 | 12.60 |
2016-01-09 18:40:00 | 10.390 | 10.340 | 10.260 | 10.110 | 9.840 | 9.760 | 0.683 | 0.640 | 0.710 | 0.696 | ... | 4.405 | 110.90 | 4.308 | 109.30 | 4.776 | 0.589 | 100.0 | 933.0 | 0.0 | 12.60 |
2016-01-09 18:50:00 | 10.360 | 10.320 | 10.320 | 10.230 | 9.830 | 9.820 | 0.840 | 0.780 | 0.857 | 0.841 | ... | 3.509 | 108.30 | 3.642 | 106.00 | 4.478 | 0.448 | 100.0 | 932.0 | 0.0 | 12.61 |
2016-01-09 19:00:00 | 9.950 | 9.880 | 10.050 | 9.970 | 9.910 | 9.870 | 0.561 | 0.469 | 0.639 | 0.547 | ... | 3.601 | 102.30 | 3.085 | 100.70 | 3.954 | 0.701 | 100.0 | 932.0 | 0.0 | 12.60 |
2016-01-09 19:10:00 | 9.620 | 9.600 | 9.590 | 9.550 | 9.240 | 9.260 | 0.767 | 0.666 | 0.847 | 0.732 | ... | 4.701 | 100.80 | 4.184 | 99.90 | 4.012 | 0.691 | 100.0 | 932.0 | 0.0 | 12.59 |
2016-01-09 19:20:00 | 9.590 | 9.600 | 9.510 | 9.460 | 9.270 | 9.270 | 0.615 | 0.571 | 0.795 | 0.760 | ... | 5.744 | 109.30 | 5.838 | 105.20 | 5.478 | 0.711 | 100.0 | 932.0 | 0.0 | 12.58 |
2016-01-09 19:30:00 | 10.520 | 10.520 | 10.520 | 10.440 | 10.510 | 10.490 | 0.608 | 0.571 | 0.501 | 0.477 | ... | 2.017 | 122.00 | 2.219 | 118.60 | 2.905 | 0.823 | 100.0 | 932.0 | 0.0 | 12.58 |
2016-01-09 19:40:00 | 8.970 | 8.950 | 9.190 | 9.100 | 9.260 | 9.190 | 0.691 | 0.617 | 0.705 | 0.659 | ... | 3.884 | 123.10 | 4.036 | 119.20 | 3.911 | 0.913 | 100.0 | 931.0 | 0.0 | 12.57 |
2016-01-09 19:50:00 | 7.334 | 7.339 | 7.212 | 7.151 | 6.995 | 6.964 | 1.286 | 1.283 | 1.470 | 1.423 | ... | 13.060 | 113.80 | 12.850 | 112.80 | 9.380 | 1.156 | 100.0 | 931.0 | 0.0 | 12.57 |
2016-01-09 20:00:00 | 5.310 | 5.445 | 4.788 | 4.901 | 4.093 | 4.190 | 1.202 | 1.190 | 1.259 | 1.226 | ... | 16.890 | 92.30 | 16.320 | 82.10 | 20.130 | 1.317 | 100.0 | 931.0 | 0.0 | 12.56 |
2016-01-09 20:10:00 | 4.052 | 4.044 | 3.956 | 3.921 | 3.572 | 3.500 | 0.881 | 0.868 | 0.802 | 0.787 | ... | 16.450 | 106.00 | 18.110 | 98.80 | 18.040 | 1.389 | 100.0 | 931.0 | 0.0 | 12.56 |
2016-01-09 20:20:00 | 3.632 | 3.519 | 3.653 | 3.584 | 3.594 | 3.524 | 0.816 | 0.787 | 0.658 | 0.615 | ... | 11.660 | 95.00 | 10.780 | 90.90 | 10.660 | 1.288 | 100.0 | 931.0 | 0.0 | 12.55 |
2016-01-09 20:30:00 | 6.868 | 6.797 | 6.667 | 6.566 | 6.363 | 6.277 | 1.297 | 1.313 | 1.383 | 1.371 | ... | 7.568 | 112.40 | 5.917 | 111.90 | 6.342 | 1.449 | 100.0 | 931.0 | 0.0 | 12.54 |
2016-01-09 20:40:00 | 3.496 | 3.462 | 4.098 | 4.110 | 4.252 | 4.318 | 1.363 | 1.367 | 1.616 | 1.593 | ... | 13.770 | 104.30 | 12.860 | 106.20 | 12.810 | 1.317 | 100.0 | 931.0 | 0.0 | 12.54 |
2016-01-09 20:50:00 | 2.390 | 2.394 | 2.033 | 2.000 | 1.402 | 1.347 | 0.774 | 0.706 | 0.725 | 0.682 | ... | 21.640 | 98.70 | 33.320 | 78.95 | 39.480 | 0.913 | 100.0 | 931.0 | 0.0 | 12.54 |
2016-01-09 21:00:00 | 3.002 | 3.042 | 2.509 | 2.521 | 2.252 | 2.156 | 1.313 | 1.265 | 1.237 | 1.285 | ... | 22.610 | 73.82 | 25.350 | 56.03 | 33.450 | 1.338 | 100.0 | 931.0 | 0.0 | 12.53 |
2016-01-09 21:10:00 | 3.489 | 3.448 | 3.061 | 3.031 | 2.874 | 2.851 | 0.821 | 0.793 | 0.632 | 0.575 | ... | 10.750 | 74.83 | 11.530 | 66.91 | 12.570 | 1.176 | 100.0 | 930.0 | 0.0 | 12.53 |
2016-01-09 21:20:00 | 4.204 | 4.321 | 3.661 | 3.715 | 2.972 | 3.074 | 0.995 | 0.924 | 1.049 | 1.005 | ... | 10.920 | 83.60 | 11.370 | 75.78 | 14.990 | 1.236 | 100.0 | 930.0 | 0.0 | 12.52 |
2016-01-09 21:30:00 | 6.486 | 6.648 | 5.745 | 5.879 | 4.862 | 5.021 | 1.224 | 1.237 | 1.329 | 1.329 | ... | 12.640 | 69.43 | 12.760 | 65.76 | 14.290 | 1.398 | 100.0 | 930.0 | 0.0 | 12.52 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2017-11-23 06:00:00 | 10.620 | NaN | 10.420 | 10.370 | 10.110 | 10.030 | 0.857 | NaN | 0.803 | 0.752 | ... | NaN | NaN | NaN | 257.20 | 4.303 | 0.589 | 100.0 | 940.0 | 0.1 | 12.71 |
2017-11-23 06:10:00 | 10.720 | NaN | 10.670 | 10.600 | 10.470 | 10.380 | 1.027 | NaN | 0.938 | 0.919 | ... | NaN | NaN | NaN | 258.70 | 4.351 | 0.701 | 100.0 | 940.0 | 0.0 | 12.71 |
2017-11-23 06:20:00 | 10.940 | NaN | 10.650 | 10.600 | 10.290 | 10.210 | 0.708 | NaN | 0.823 | 0.778 | ... | NaN | NaN | NaN | 257.70 | 3.923 | 0.761 | 99.5 | 940.0 | 0.0 | 12.71 |
2017-11-23 06:30:00 | 10.270 | NaN | 10.130 | 9.960 | 9.910 | 9.790 | 0.800 | NaN | 0.865 | 0.808 | ... | NaN | NaN | NaN | 251.60 | 3.722 | 0.792 | 98.0 | 940.0 | 0.0 | 12.71 |
2017-11-23 06:40:00 | 8.180 | NaN | 7.985 | 7.811 | 7.649 | 7.519 | 1.167 | NaN | 1.239 | 1.199 | ... | NaN | NaN | NaN | 242.80 | 4.858 | 0.518 | 98.2 | 941.0 | 0.0 | 12.71 |
2017-11-23 06:50:00 | 8.870 | NaN | 8.570 | 8.330 | 8.020 | 7.873 | 0.676 | NaN | 0.582 | 0.534 | ... | NaN | NaN | NaN | 232.90 | 3.426 | 0.782 | 99.9 | 941.0 | 0.0 | 12.71 |
2017-11-23 07:00:00 | 9.300 | NaN | 8.950 | 8.670 | 8.590 | 8.400 | 0.933 | NaN | 0.994 | 0.937 | ... | NaN | NaN | NaN | 229.30 | 4.575 | 0.761 | 99.5 | 941.0 | 0.0 | 12.71 |
2017-11-23 07:10:00 | 8.130 | NaN | 7.863 | 7.592 | 7.283 | 7.094 | 0.823 | NaN | 0.837 | 0.794 | ... | NaN | NaN | NaN | 225.10 | 4.478 | 0.721 | 99.9 | 941.0 | 0.0 | 12.67 |
2017-11-23 07:20:00 | 8.910 | NaN | 8.500 | 8.250 | 8.020 | 7.864 | 1.063 | NaN | 0.977 | 0.929 | ... | NaN | NaN | NaN | 226.70 | 5.236 | 0.751 | 100.0 | 941.0 | 0.0 | 12.66 |
2017-11-23 07:30:00 | 9.610 | NaN | 9.260 | 8.980 | 8.630 | 8.470 | 0.942 | NaN | 0.923 | 0.860 | ... | NaN | NaN | NaN | 228.10 | 5.127 | 0.670 | 100.0 | 941.0 | 0.0 | 12.66 |
2017-11-23 07:40:00 | 10.900 | NaN | 10.310 | 10.050 | 9.680 | 9.490 | 1.268 | NaN | 1.170 | 1.111 | ... | NaN | NaN | NaN | 232.70 | 5.563 | 0.570 | 100.0 | 941.0 | 0.0 | 12.67 |
2017-11-23 07:50:00 | 11.560 | NaN | 11.080 | 10.790 | 10.400 | 10.190 | 1.022 | NaN | 1.114 | 1.067 | ... | NaN | NaN | NaN | 238.60 | 4.363 | 0.751 | 100.0 | 941.0 | 0.0 | 12.70 |
2017-11-23 08:00:00 | 11.730 | NaN | 11.290 | 11.030 | 10.690 | 10.490 | 0.842 | NaN | 0.834 | 0.759 | ... | NaN | NaN | NaN | 237.40 | 3.395 | 0.771 | 100.0 | 941.0 | 0.0 | 12.70 |
2017-11-23 08:10:00 | 11.700 | NaN | 11.420 | 11.130 | 11.030 | 10.830 | 0.764 | NaN | 0.727 | 0.673 | ... | NaN | NaN | NaN | 235.90 | 3.391 | 0.863 | 99.7 | 941.0 | 0.0 | 12.66 |
2017-11-23 08:20:00 | 12.270 | NaN | 12.020 | 11.760 | 11.510 | 11.330 | 0.774 | NaN | 0.800 | 0.755 | ... | NaN | NaN | NaN | 237.00 | 3.657 | 0.842 | 99.5 | 941.0 | 0.0 | 12.66 |
2017-11-23 08:30:00 | 11.510 | NaN | 11.260 | 10.990 | 10.740 | 10.530 | 0.956 | NaN | 0.903 | 0.857 | ... | NaN | NaN | NaN | 234.60 | 3.988 | 0.701 | 99.3 | 942.0 | 0.0 | 12.65 |
2017-11-23 08:40:00 | 11.380 | NaN | 10.850 | 10.520 | 10.310 | 10.090 | 0.962 | NaN | 0.867 | 0.840 | ... | NaN | NaN | NaN | 235.00 | 4.448 | 0.751 | 99.7 | 942.0 | 0.0 | 12.68 |
2017-11-23 08:50:00 | 10.420 | NaN | 10.090 | 9.810 | 9.590 | 9.380 | 1.034 | NaN | 1.141 | 1.111 | ... | NaN | NaN | NaN | 223.70 | 7.394 | 0.680 | 99.7 | 942.0 | 0.0 | 12.69 |
2017-11-23 09:00:00 | 9.050 | NaN | 8.530 | 8.270 | 7.696 | 7.503 | 0.668 | NaN | 0.763 | 0.681 | ... | NaN | NaN | NaN | 211.60 | 7.172 | 0.812 | 100.0 | 942.0 | 0.0 | 12.70 |
2017-11-23 09:10:00 | 7.484 | NaN | 7.231 | 6.956 | 6.596 | 6.404 | 1.009 | NaN | 0.868 | 0.811 | ... | NaN | NaN | NaN | 213.00 | 7.851 | 0.771 | 98.4 | 942.0 | 0.0 | 12.69 |
2017-11-23 09:20:00 | 7.228 | NaN | 6.903 | 6.691 | 6.273 | 6.089 | 0.756 | NaN | 0.841 | 0.753 | ... | NaN | NaN | NaN | 209.10 | 7.654 | 0.549 | 99.0 | 943.0 | 0.0 | 12.70 |
2017-11-23 09:30:00 | 7.740 | NaN | 7.359 | 7.147 | 6.889 | 6.775 | 0.634 | NaN | 0.618 | 0.586 | ... | NaN | NaN | NaN | 221.00 | 6.078 | 0.711 | 99.6 | 943.0 | 0.0 | 12.52 |
2017-11-23 09:40:00 | 8.380 | NaN | 7.900 | 7.675 | 7.190 | 7.068 | 0.822 | NaN | 0.881 | 0.798 | ... | NaN | NaN | NaN | 221.90 | 4.969 | 0.651 | 98.8 | 943.0 | 0.0 | 12.73 |
2017-11-23 09:50:00 | 9.870 | NaN | 9.250 | 8.970 | 8.450 | 8.250 | 0.954 | NaN | 0.878 | 0.805 | ... | NaN | NaN | NaN | 220.20 | 5.269 | 0.873 | 99.8 | 943.0 | 0.0 | 12.89 |
2017-11-23 10:00:00 | 9.800 | NaN | 9.340 | 9.070 | 8.630 | 8.450 | 1.170 | NaN | 1.136 | 1.076 | ... | NaN | NaN | NaN | 221.90 | 5.125 | 0.731 | 98.6 | 943.0 | 0.0 | 12.94 |
2017-11-23 10:10:00 | 10.480 | NaN | 10.190 | 9.890 | 9.590 | 9.420 | 0.720 | NaN | 0.733 | 0.668 | ... | NaN | NaN | NaN | 222.20 | 4.111 | 0.943 | 99.7 | 943.0 | 0.0 | 13.02 |
2017-11-23 10:20:00 | 9.390 | NaN | 9.120 | 8.850 | 8.520 | 8.340 | 0.659 | NaN | 0.734 | 0.651 | ... | NaN | NaN | NaN | 218.40 | 4.817 | 0.792 | 98.6 | 943.0 | 0.0 | 13.69 |
2017-11-23 10:30:00 | 9.140 | NaN | 8.700 | 8.450 | 8.030 | 7.875 | 0.689 | NaN | 0.821 | 0.732 | ... | NaN | NaN | NaN | 216.00 | 5.784 | 0.802 | 100.0 | 943.0 | 0.0 | 13.86 |
2017-11-23 10:40:00 | 7.927 | NaN | 7.383 | 7.159 | 6.811 | 6.668 | 0.817 | NaN | 0.769 | 0.692 | ... | NaN | NaN | NaN | 219.50 | 5.051 | 0.883 | 100.0 | 943.0 | 0.0 | 13.80 |
2017-11-23 10:50:00 | 7.120 | NaN | 6.617 | 6.404 | 5.865 | 5.749 | 0.537 | NaN | 0.534 | 0.450 | ... | NaN | NaN | NaN | 222.40 | 4.902 | 0.802 | 100.0 | 944.0 | 0.0 | 13.71 |
95629 rows × 29 columns
Plot monthly means of cleaned wind speed:¶
[9]:
# plot monthly means of wind speed for each anemometer
bw.monthly_means(clean_data[anemometers])
[9]:
Note that the spurious ‘Spd80mS’ values have been removed.
Overview of cleaning file structure:¶
[10]:
# load cleaning file
cleaning_file = bw.load_cleaning_file(cleaning_file_path)
# show cleaning file
cleaning_file
[10]:
Sensor | Start | Stop | Reason | |
---|---|---|---|---|
0 | All | 2016-01-09 15:30:00 | 2016-01-09 17:10:00 | Installation |
1 | Spd | 2016-03-09 06:20:00 | 2016-03-09 10:30:00 | Icing |
2 | Dir | 2016-03-09 06:20:00 | 2016-03-09 10:30:00 | Icing |
3 | Spd | 2016-03-29 23:50:00 | 2016-03-30 07:10:00 | Icing |
4 | Dir | 2016-03-29 23:50:00 | 2016-03-30 07:10:00 | Icing |
5 | Spd | 2016-11-08 02:30:00 | 2016-11-08 10:50:00 | Icing |
6 | Dir | 2016-11-08 02:30:00 | 2016-11-08 10:50:00 | Icing |
7 | Spd | 2016-11-18 15:50:00 | 2016-11-19 10:30:00 | Icing |
8 | Dir | 2016-11-18 15:50:00 | 2016-11-19 10:30:00 | Icing |
9 | Spd | 2016-11-20 16:40:00 | 2016-11-21 12:40:00 | Icing |
10 | Dir | 2016-11-20 16:40:00 | 2016-11-21 12:40:00 | Icing |
11 | Spd | 2017-01-21 00:00:00 | 2017-01-21 07:10:00 | Icing |
12 | Dir | 2017-01-21 00:00:00 | 2017-01-21 07:10:00 | Icing |
13 | Spd | 2017-01-28 14:10:00 | 2017-01-28 17:30:00 | Icing |
14 | Dir | 2017-01-28 14:10:00 | 2017-01-28 17:30:00 | Icing |
15 | Spd | 2017-10-30 01:40:00 | 2017-10-30 07:00:00 | Icing |
16 | Dir | 2017-10-30 01:40:00 | 2017-10-30 07:00:00 | Icing |
17 | Dir58mS | 2016-12-26 07:00:00 | 2018-02-09 21:30:00 | Invalid |
18 | Dir78mS | 2017-08-11 02:10:00 | 2018-02-09 21:30:00 | Invalid |
19 | Spd80mS | 2017-09-04 00:30:00 | 2018-02-09 21:30:00 | Invalid |
Each row of the cleaning file relates to a portion of the time series which has been flagged for quality reasons.
The ‘Sensor’ column specifies which variables are affected by the issue.
The function looks for all column names in your data that contain the sensor name. Therefore, when the sensor name is just ‘Spd’ it will find ALL column names that contain ‘Spd’ and clean out the data.
The sensor name ‘All’ is a special name which cleans ALL the data for that period.
The ‘Start’ and ‘Stop’ dates specify the length the time period to be removed. The flagged data is inclusive of the ‘start’ time, and ends before the ‘stop’ time.
The ‘Reason’ field justifies why the data should be removed.
2. apply_cleaning_windographer()
with a windographer file:¶
Import some sample data:¶
[11]:
# specify location of existing sample dataset
campbell_data_file_path = r'C:\Users\Stephen\Documents\Analysis\campbell_scientific_demo_data.csv'
# load data as dataframe
campbell_data = bw.load_campbell_scientific(campbell_data_file_path)
# show first few rows of dataframe
campbell_data.head(5)
[11]:
RECORD | Site | LoggerID | Spd80mN | Spd80mS | Spd60mN | Spd60mS | Spd40mN | Spd40mS | Spd80mNStd | ... | Dir78mSStd | Dir58mS | Dir58mSStd | Dir38mS | Dir38mSStd | T2m | RH2m | P2m | PrcpTot | BattMin | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Timestamp | |||||||||||||||||||||
2016-01-09 15:30:00 | 0 | demo_mast | 7000 | 8.370 | 7.911 | 8.160 | 7.849 | 7.857 | 7.626 | 1.240 | ... | 6.100 | 110.1 | 6.009 | 112.2 | 5.724 | 0.711 | 100.0 | 935.0 | 0.0 | 12.94 |
2016-01-09 15:40:00 | 1 | demo_mast | 7000 | 8.250 | 7.961 | 8.100 | 7.884 | 7.952 | 7.840 | 0.897 | ... | 5.114 | 110.9 | 4.702 | 109.8 | 5.628 | 0.630 | 100.0 | 935.0 | 0.0 | 12.95 |
2016-01-09 17:00:00 | 2 | demo_mast | 7000 | 7.652 | 7.545 | 7.671 | 7.551 | 7.531 | 7.457 | 0.756 | ... | 4.172 | 113.1 | 3.447 | 111.8 | 4.016 | 1.126 | 100.0 | 934.0 | 0.0 | 12.75 |
2016-01-09 17:10:00 | 3 | demo_mast | 7000 | 7.382 | 7.325 | 6.818 | 6.689 | 6.252 | 6.174 | 0.844 | ... | 4.680 | 118.8 | 5.107 | 115.6 | 5.189 | 0.954 | 100.0 | 934.0 | 0.0 | 12.71 |
2016-01-09 17:20:00 | 4 | demo_mast | 7000 | 7.977 | 7.791 | 8.110 | 7.915 | 8.140 | 7.974 | 0.556 | ... | 3.123 | 115.9 | 2.960 | 113.6 | 3.540 | 0.863 | 100.0 | 934.0 | 0.0 | 12.69 |
5 rows × 32 columns
Plot monthly means of raw wind speed:¶
[12]:
# plot monthly means of wind speed for each anemometer
bw.monthly_means(campbell_data[anemometers])
[12]:
You can see here that the data is the same as example 1, just loaded in a different format.
Apply cleaning to data based on a pre-existing windographer cleaning file:¶
Cleaning can also be applied to windographer files, by passing inplace=True to the apply_cleaning_windographer function. By default, inplace=False and the cleaned data must be assigned to a new variables, as below:
[13]:
# specify location of associated cleaning file
windog_log_file_path = r'C:\Users\Stephen\Documents\Analysis\windographer_flagging_log.txt'
# apply cleaning
campbell_data_clean = bw.apply_cleaning_windographer(campbell_data, windog_log_file_path)
Plot monthly means of cleaned wind speed:¶
[14]:
# plot monthly means of wind speed for each anemometer
bw.monthly_means(campbell_data_clean[anemometers])
[14]:
As with example 1, the spurious data has been removed.