Dataloaders: Wine Quality
class wine_dataloader.WineQuality(path, train=True, noise=False, noise_type=None, distribution_data=None, normalize=False, size=None)
- Description:
- The Wine Quality dataset [1] contains red and white vinho verde wine samples from the north of Portugal. The goal is to model wine quality based on physicochemical tests.
- Attribute Information:
Input variables (based on physicochemical tests):
1: fixed acidity
2: volatile acidity
3: citric acid
4: residual sugar
5: chlorides
6: free sulfur dioxide
7: total sulfur dioxide
8: density
9: pH
10: sulphates
11: alcohol
Output variable (based on sensory data):
12: quality (score between 0 and 10)
- Args:
path (string): A path to the Wine Quality dataset directory.
train (bool): A boolean that selects the training data (True) or the testing data (False).
noise (bool): A boolean that controls whether noise is added to the data.
noise_type (string): The type of the noise.
distribution_data (list): A list of information that is needed for noise generation.
normalize (bool): A boolean that controls whether the data is normalized (True) or not (False).
size (int): Size of the dataset (training or testing).
load_data()
- Description:
- Loads the dataset.
- Return:
- features, labels.
- Return type:
- Tuple
- Args:
- None.
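The (features, labels) tuple can be pictured as the eleven input variables listed above paired with the quality score. The following is a minimal sketch of that split using only the standard library. It is not the implementation of load_data: the helper name split_features_labels, the semicolon delimiter, the header names, and the sample values are all assumptions made for illustration.

```python
import csv
import io

# Column names taken from the attribute list; their exact spelling in the
# dataset files is an assumption.
FEATURE_NAMES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]
LABEL_NAME = "quality"


def split_features_labels(csv_text, delimiter=";"):
    """Parse CSV text with a header row into (features, labels) lists."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    features, labels = [], []
    for row in reader:
        features.append([float(row[name]) for name in FEATURE_NAMES])
        labels.append(float(row[LABEL_NAME]))
    return features, labels


# Illustrative values only, not taken from the dataset.
sample = (
    ";".join(FEATURE_NAMES + [LABEL_NAME]) + "\n"
    "7.0;0.5;0.1;2.0;0.08;10;30;0.997;3.4;0.6;9.5;5\n"
    "6.5;0.3;0.3;1.8;0.05;15;40;0.995;3.3;0.7;10.5;6\n"
)
features, labels = split_features_labels(sample)
```

Each entry of features holds the eleven physicochemical inputs for one sample, and the matching entry of labels holds its quality score.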
get_uniform_params(mu, v)
- Description:
Generates the bounds of the uniform distribution using the mean and the variance, by solving the formulas
a = mu - sqrt(3*v)
b = mu + sqrt(3*v)
- Return:
- Uniform distribution bounds a and b.
- Return type:
- Tuple.
- Args:
mu (float): The mean of the uniform distribution.
v (float): The variance of the uniform distribution.
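These bounds follow from the moments of a uniform distribution on [a, b]: the mean is (a + b)/2 and the variance is (b - a)^2/12. A minimal sketch of the computation, with a round-trip check, might look like the following. The function name uniform_bounds is hypothetical and this is not the method's actual source.

```python
import math


def uniform_bounds(mu, v):
    """Return (a, b) of the uniform distribution with mean mu and variance v."""
    if v < 0:
        raise ValueError("variance must be non-negative")
    half_width = math.sqrt(3 * v)  # equals (b - a) / 2
    return mu - half_width, mu + half_width


a, b = uniform_bounds(mu=2.0, v=3.0)

# Recover the moments from the bounds as a sanity check.
mean_back = (a + b) / 2
var_back = (b - a) ** 2 / 12
```

With mu = 2 and v = 3 the half-width is sqrt(9) = 3, so the bounds are -1 and 5.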
get_gamma_params(mu, v)
- Description:
Generates the shape or concentration (alpha) and rate (beta) using the mean and variance of the gamma distribution:
alpha = k
beta = 1/theta
where
k = (mu**2)/v
theta = v/mu
- Return:
- Alpha, Beta.
- Return type:
- Tuple.
- Args:
mu (float): The mean of gamma distribution.
v (float): The variance of gamma distribution.
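For a gamma distribution with shape k and scale theta, the mean is k*theta and the variance is k*theta**2, so the shape is mu**2/v and the rate (the inverse of the scale) is mu/v. The sketch below works through that conversion with the standard library only. The function name gamma_shape_rate is hypothetical and this is not the method's actual source.

```python
import random


def gamma_shape_rate(mu, v):
    """Return (alpha, beta): shape and rate of the gamma distribution
    with mean mu and variance v."""
    if mu <= 0 or v <= 0:
        raise ValueError("mean and variance must be positive")
    k = mu ** 2 / v   # shape
    theta = v / mu    # scale
    return k, 1.0 / theta


alpha, beta = gamma_shape_rate(mu=2.0, v=0.5)

# In the shape/rate form: mean = alpha / beta, variance = alpha / beta**2.
mean_back = alpha / beta
var_back = alpha / beta ** 2

# random.gammavariate expects shape and *scale*, so the rate is inverted.
rng = random.Random(0)
samples = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

Note the parameterization difference: some libraries take a rate and others a scale, so it is worth checking which one a given sampler expects before passing beta to it.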
get_distribution(dist_type, data, is_params_estimated, vmax=False, vmax_scale=1)
- Description:
- Creates a probability distribution (uniform or gamma).
- Return:
- A probability distribution.
- Return type:
- Object.
- Args:
dist_type: An argument that specifies the type of the distribution.
data: A list that contains the distribution information.
is_params_estimated: An argument that controls how the data is used to create the probability distribution. The data can be either distribution statistics (mean and variance) or distribution parameters.
vmax: A boolean that controls whether maximum heteroscedasticity is used.
vmax_scale: An argument that specifies the heteroscedasticity scale.
gaussian_noise(var_dists, p=0.5)
- Description:
- Generates Gaussian noise with a mean centered at 0 and heteroscedastic variances sampled from a range of distributions.
- Return:
- Gaussian noise values and their heteroscedastic variances.
- Return type:
- Tuple.
- Args:
var_dists (object): Noise variance probability distributions.
p (float): The contribution ratio of low and high noise variance distributions.
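One way to picture heteroscedastic Gaussian noise is that every sample first draws its own variance and then draws a zero-mean Gaussian value with that variance. The sketch below illustrates the idea under stated assumptions and is not the method's actual source: the function name heteroscedastic_gaussian_noise, the (low, high) pair of sampler callables, the argument n, and the reading of p as the probability of drawing from the low-variance distribution are all hypothetical.

```python
import math
import random


def heteroscedastic_gaussian_noise(var_samplers, n, p=0.5, seed=None):
    """Draw n zero-mean Gaussian noise values, each with its own variance.

    var_samplers: (low, high) pair of callables; each takes a random.Random
        and returns one non-negative variance (assumed interface).
    p: assumed probability of using the low-variance sampler for a sample.
    """
    rng = random.Random(seed)
    low, high = var_samplers
    noises, variances = [], []
    for _ in range(n):
        sampler = low if rng.random() < p else high
        var = sampler(rng)
        # rng.gauss takes a standard deviation, hence the square root.
        noises.append(rng.gauss(0.0, math.sqrt(var)))
        variances.append(var)
    return noises, variances


# Low-noise variances from a uniform distribution, high-noise variances from
# a gamma distribution (shape 8, scale 0.25); both choices are illustrative.
low = lambda rng: rng.uniform(0.0, 0.1)
high = lambda rng: rng.gammavariate(8.0, 0.25)

noises, variances = heteroscedastic_gaussian_noise((low, high), n=1000, p=0.5, seed=0)
```

Returning the variances alongside the noise values mirrors the documented return tuple, so that downstream code can see how noisy each sample is.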
generate_noise(norm=False)
- Description:
- Unpacks information and calls gaussian_noise to generate noise.
- Return:
- Gaussian noise values and their heteroscedastic variances.
- Return type:
- Tuple.
- Args:
norm (bool): Whether normalization is applied.
References
[1]