D19IT155 PRACTICAL EXAM

Dataset Description using Orange tool.

Perform various data preprocessing tasks such as encoding, normalization, missing value handling, and feature selection on the data with the help of Orange's built-in functions.

SCREENSHOT OF FIRST TASK FROM ORANGE TOOL FILE


ENCODING

To perform encoding, use the Continuize Discrete Variables option.

  • One feature per value: creates a column for each distinct value, placing 1 where an instance has that value and 0 where it does not.
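To make the "one feature per value" idea concrete, here is a minimal sketch in plain Python rather than Orange itself. The function name one_hot and the sample colors list are hypothetical, chosen only for illustration; Orange performs the equivalent step internally when this option is selected.

```python
def one_hot(values):
    """Return (categories, rows); each row holds a 1 in the column of its value."""
    # One output column per distinct value, in sorted order for reproducibility.
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical sample column with three distinct values.
colors = ["red", "green", "red", "blue"]
categories, encoded = one_hot(colors)

for value, row in zip(colors, encoded):
    print(value, row)
```

Each row contains exactly one 1, so the original value can always be recovered from the encoded columns.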

NORMALIZATION 

Normalization scales the values of an attribute so that they fall within a smaller range, such as -1.0 to 1.0 or 0.0 to 1.0. It is generally required when attributes are on different scales; otherwise, an equally important attribute on a smaller scale may lose its effect because another attribute has values on a larger scale. We use the Normalize function to perform normalization.
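As a rough illustration of scaling to the two ranges mentioned above, the following sketch applies min-max scaling in plain Python. This is one common way to normalize and is an assumption here, not necessarily the exact method Orange applies; min_max_scale and the incomes list are hypothetical names.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column has no spread; map every value to the lower bound.
        return [low for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


# Hypothetical attribute on a large scale.
incomes = [20000, 50000, 80000]
scaled_01 = min_max_scale(incomes)              # range 0.0 to 1.0
scaled_pm1 = min_max_scale(incomes, -1.0, 1.0)  # range -1.0 to 1.0

print(scaled_01)
print(scaled_pm1)
```

After scaling, an attribute measured in tens of thousands and one measured in single digits contribute on a comparable scale.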

MISSING VALUE HANDLING 


Missing values can be handled in one of the following ways:

  1. Replace with the average or most frequent value
  2. Replace with a random value
  3. Remove rows with missing values
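The three strategies listed above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: missing entries are represented by None, the random strategy samples from the values already observed in the column, and all function names are hypothetical rather than part of Orange.

```python
import random
from collections import Counter
from statistics import mean


def impute_mean(column):
    """Strategy 1a: fill missing numeric entries with the column average."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]


def impute_most_frequent(column):
    """Strategy 1b: fill missing entries with the most common observed value."""
    present = [v for v in column if v is not None]
    fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in column]


def impute_random(column, rng):
    """Strategy 2: fill each missing entry with a randomly chosen observed value."""
    present = [v for v in column if v is not None]
    return [rng.choice(present) if v is None else v for v in column]


def drop_incomplete_rows(rows):
    """Strategy 3: keep only rows that contain no missing entries."""
    return [row for row in rows if None not in row]


# Hypothetical sample data.
ages = [20, None, 40]
grades = ["a", None, "a", "b"]
table = [[20, "a"], [None, "b"], [40, None], [35, "a"]]

mean_filled = impute_mean(ages)
mode_filled = impute_most_frequent(grades)
random_filled = impute_random(ages, random.Random(0))
complete_rows = drop_incomplete_rows(table)
```

Removing rows is the simplest option but discards data, so it is usually reserved for datasets where only a few rows are incomplete.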

FEATURE SELECTION 




Feature Statistics



RAW DATA AND DATA AFTER DATA PREPROCESSING


RAW DATA

DATA AFTER DATA PREPROCESSING







 
