brightwind.analyse.analyse.dist
brightwind.analyse.analyse.dist(var_series, var_to_bin_against=None, bins=None, bin_labels=None, x_label=None, max_y_value=None, aggregation_method='%frequency', return_data=False)
Calculates the distribution of a variable according to the bins specified. By default the variable is binned against itself; pass a second variable to find the distribution with respect to that variable instead.
- Parameters
var_series (pandas.Series) – Time-series of the variable whose distribution we need to find
var_to_bin_against (pandas.Series, None) – (optional) Time-series of the variable to bin against, if binning against a variable other than var_series.
bins (list, array, None) – Array of numbers where adjacent elements of the array form a bin. If set to None, it derives the min and max from the var_to_bin_against series and creates an array in steps of 1.
bin_labels (list, array, None) – Labels of bins to be used. Uses the (bin-start, bin-end] format by default.
x_label (str, None) – x-axis label to be used. If None, the name of the series passed in is used.
max_y_value (float, int, None) – Maximum value to set for the y-axis of the plot. By default it is set relative to the maximum calculated data value.
aggregation_method (str or function) – Statistical method used to find distribution. It can be mean, max, min, std, count, %frequency or a custom function. Computes frequency in percentages by default.
return_data (bool) – Set to True if you want the data returned.
- Returns
A distribution plot and, if return_data is True, a pandas.Series with the bins as row indexes and values holding the statistic chosen by aggregation_method.
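To make the binning semantics described above concrete, the following is a minimal sketch in plain pandas of how such a distribution could be computed. It is not brightwind's implementation: the function name sketch_dist and the sample data are hypothetical, it omits plotting and bin labels, and the rounding of the default bin edges with floor and ceil is an assumption.

```python
import numpy as np
import pandas as pd


def sketch_dist(var_series, var_to_bin_against=None, bins=None,
                aggregation_method="%frequency"):
    """Hypothetical helper illustrating the binning; not brightwind's code."""
    if var_to_bin_against is None:
        # Bin the variable against itself.
        var_to_bin_against = var_series
    if bins is None:
        # Assumption: unit-width bins spanning the min and max of the
        # binning series, with edges rounded outwards to whole numbers.
        start = np.floor(var_to_bin_against.min())
        stop = np.ceil(var_to_bin_against.max())
        bins = np.arange(start, stop + 1, 1)
    # right=True produces (bin-start, bin-end] intervals.
    binned = pd.cut(var_to_bin_against, bins=bins, right=True)
    grouped = var_series.groupby(binned, observed=False)
    if aggregation_method == "%frequency":
        # Share of all non-null samples that falls in each bin, in percent.
        return grouped.count() / var_series.count() * 100.0
    # Named statistics ('mean', 'max', ...) or a custom callable.
    return grouped.agg(aggregation_method)


# Made-up sample data for illustration only.
speeds = pd.Series([1.0, 3.0, 5.0, 9.0, 13.0, 20.0, 7.0, 10.0])
temps = pd.Series([-5.0, 2.0, 6.0, 10.0, 15.0, 20.0, 25.0, 3.0])

freq = sketch_dist(speeds, bins=[0, 8, 12, 21])
mean_by_temp = sketch_dist(speeds, var_to_bin_against=temps,
                           bins=[-10, 4, 12, 18, 30],
                           aggregation_method="mean")
```

Because the intervals are open on the left, a sample exactly equal to the lowest bin edge falls outside every bin in this sketch; whether brightwind treats that edge the same way is not stated here.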
Example usage
    import brightwind as bw
    data = bw.load_csv(bw.datasets.demo_data)

    # For distribution of %frequency of wind speeds
    dist = bw.dist(data.Spd40mN, bins=[0, 8, 12, 21], bin_labels=['normal', 'gale', 'storm'])

    # For distribution of temperature
    temp_dist = bw.dist(data.T2m)

    # For distribution of temperature with set bin array
    temp_dist = bw.dist(data.T2m, bins=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    # For custom aggregation function
    def custom_agg(x):
        return x.mean() + (2 * x.std())
    temp_dist = bw.dist(data.T2m, bins=[-10, 4, 12, 18, 30], aggregation_method=custom_agg)

    # For distribution of mean wind speeds with respect to temperature
    spd_dist = bw.dist(data.Spd40mN, var_to_bin_against=data.T2m,
                       bins=[-10, 4, 12, 18, 30],
                       bin_labels=['freezing', 'cold', 'mild', 'hot'],
                       aggregation_method='mean')