brightwind.analyse.analyse.dist

brightwind.analyse.analyse.dist(var_series, var_to_bin_against=None, bins=None, bin_labels=None, x_label=None, max_y_value=None, aggregation_method='%frequency', return_data=False)

Calculates the distribution of a variable across the specified bins. By default the variable is binned against itself; a second variable can be passed to find the distribution with respect to that variable instead.

Parameters
  • var_series (pandas.Series) – Time-series of the variable whose distribution we need to find

  • var_to_bin_against (pandas.Series, None) – (optional) Time-series of the variable to bin against, if binning against a variable other than var_series.

  • bins (list, array, None) – Array of numbers where adjacent elements of the array form a bin. If set to None, it derives the min and max from the var_to_bin_against series and creates an array in steps of 1.

  • bin_labels (list, array, None) – Labels to use for the bins. Uses the (bin-start, bin-end] format by default.

  • x_label (str, None) – x-axis label to be used. If None, it will take the name of the series sent.

  • max_y_value (float, int) – Maximum value for the y-axis of the plot. By default it is set relative to the maximum calculated data value.

  • aggregation_method (str or function) – Statistical method used to find distribution. It can be mean, max, min, std, count, %frequency or a custom function. Computes frequency in percentages by default.

  • return_data (bool) – Set to True if you want the data returned.
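To make the bins and bin_labels parameters concrete, here is a minimal sketch using pandas.cut on made-up wind speeds. It only illustrates the (bin-start, bin-end] convention described above; it is not brightwind's own code, and the way the default edges are rounded (floor and ceiling of the data range) is an assumption.

```python
import numpy as np
import pandas as pd

# Made-up wind speeds in m/s, for illustration only.
speeds = pd.Series([0.5, 3.2, 7.9, 8.0, 8.1, 11.5, 15.0, 20.0], name="Spd40mN")

# Adjacent elements form the bins (0, 8], (8, 12] and (12, 21].
edges = [0, 8, 12, 21]
binned = pd.cut(speeds, bins=edges)

# One label per bin, so len(labels) must equal len(edges) - 1.
labelled = pd.cut(speeds, bins=edges, labels=["normal", "gale", "storm"])

# Assumed default when bins is None: unit steps spanning the data range.
# Flooring and ceiling the limits is an assumption, not documented behaviour.
default_edges = np.arange(np.floor(speeds.min()), np.ceil(speeds.max()) + 1, 1)
```

Because each bin is closed on the right, a value of exactly 8.0 falls in the first bin and 8.1 in the second.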

Returns

A distribution plot and, if requested, a pandas.Series indexed by bin and containing the statistic chosen by aggregation_method.
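The returned data can be pictured as a group-by over the binned values. The sketch below is a hypothetical re-creation using only pandas: the helper name sketch_dist is made up, bins is required here, and the %frequency denominator (all non-null samples of var_series) is an assumption rather than confirmed brightwind behaviour.

```python
import pandas as pd


def sketch_dist(var_series, bins, var_to_bin_against=None,
                aggregation_method="%frequency"):
    """Hypothetical sketch of the binning logic; not brightwind's code."""
    if var_to_bin_against is None:
        # Bin the variable against itself.
        var_to_bin_against = var_series
    binned = pd.cut(var_to_bin_against, bins=bins)
    groups = var_series.groupby(binned, observed=False)
    if aggregation_method == "%frequency":
        # Assumed definition: share of all non-null samples, in percent.
        # Values outside the bins make the shares sum to less than 100.
        return groups.count() / var_series.count() * 100.0
    # Named statistics ('mean', 'max', ...) and custom functions alike.
    return groups.agg(aggregation_method)


# Made-up data, for illustration only.
speeds = pd.Series([2.0, 4.0, 6.0, 9.0, 10.0, 14.0, 16.0, 20.0])
temps = pd.Series([-5.0, 1.0, 3.0, 6.0, 10.0, 13.0, 15.0, 20.0])

freq = sketch_dist(speeds, bins=[0, 8, 12, 21])
mean_by_temp = sketch_dist(speeds, bins=[-10, 4, 12, 18, 30],
                           var_to_bin_against=temps,
                           aggregation_method="mean")
```

Here freq holds the percentage of wind speed samples in each speed bin, while mean_by_temp holds the mean wind speed within each temperature bin.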

Example usage

import brightwind as bw
data = bw.load_csv(bw.datasets.demo_data)

# For distribution of %frequency of wind speeds
dist = bw.dist(data.Spd40mN, bins=[0, 8, 12, 21], bin_labels=['normal', 'gale', 'storm'])

# For distribution of temperature
temp_dist = bw.dist(data.T2m)

# For distribution of temperature with set bin array
temp_dist = bw.dist(data.T2m, bins=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# For custom aggregation function
def custom_agg(x):
    return x.mean() + (2 * x.std())

temp_dist = bw.dist(data.T2m, bins=[-10, 4, 12, 18, 30], aggregation_method=custom_agg)

# For distribution of mean wind speeds with respect to temperature
spd_dist = bw.dist(data.Spd40mN, var_to_bin_against=data.T2m,
                   bins=[-10, 4, 12, 18, 30],
                   bin_labels=['freezing', 'cold', 'mild', 'hot'],
                   aggregation_method='mean')