Dataset-1 is a small financial dataset. It contains 31 features, all of which I engineered thinking they could be useful. In practice, I’ve found only 5 of them to be useful. It will be interesting to see whether some regression methods can automatically disregard the useless features, or whether I also have to test feature selection methods. The features are not orthogonal, so some of them carry overlapping information.
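To make "automatically disregard the useless features" concrete, here is a minimal sketch of one method that can do it: L1-regularised linear regression (the lasso), fitted by cyclic coordinate descent. This is only an illustration, not necessarily one of the methods tested here. The data is synthetic, not Dataset-1: I assume 31 non-orthogonal features that share a common factor, with only the first 5 affecting the target. The function names, the true weights, and the penalty `alpha = 0.2` are all made up for the example.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=50):
    """Minimise (1/2n) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iters):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual, with feature j's
            # own current contribution added back in.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


def make_synthetic(n_rows=300, n_features=31, seed=0):
    """Hypothetical stand-in data: correlated features, only the first 5 matter."""
    rng = random.Random(seed)
    true_w = [2.0, -1.5, 1.0, -2.5, 1.5] + [0.0] * (n_features - 5)
    X, y = [], []
    for _ in range(n_rows):
        shared = rng.gauss(0.0, 1.0)  # common factor makes features non-orthogonal
        row = [0.5 * shared + rng.gauss(0.0, 1.0) for _ in range(n_features)]
        X.append(row)
        y.append(sum(wj * xj for wj, xj in zip(true_w, row)) + rng.gauss(0.0, 0.5))
    return X, y


X, y = make_synthetic()
weights = lasso_coordinate_descent(X, y, alpha=0.2)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("features kept by the lasso:", selected)
```

The soft-thresholding step sets a weight to exactly zero whenever a feature's covariance with the remaining residual is below `alpha`, which is how the method drops features without a separate selection pass. How well this works on real data depends on the penalty strength and on how strongly the useless features correlate with the useful ones.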
The training set has 43,266 rows, and the test set has 19,588 rows. Here are histograms of the feature values and target values from the training set: