Data-Science Hackathon

Problem statement: Predict the selling price of a house in a given metro city that consists of 5000 localities.

Relevant technical literature related to the problem

Language used: Python 3
Tool used: Anaconda, Jupyter Notebook (Python and R enabled)
Libraries: scipy, numpy, scikit-learn, pandas, random (RandInteger, RandFloat, RandCategories)
A random data set was created by randomly defining the columns.

DESIGN: A random technique is applied. For each feature, a coin is tossed to choose how the value is entered:
If it is 0 (head): update with one of the randomly available options
If it is 1 (tail): update with the default choice (REALISTIC APPROACH)
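The coin-toss rule can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper name `fill_feature`, the fixed seed, and the choice of "moderate" as the default road quality are all assumptions made for the example.

```python
import random

def fill_feature(options, default, rng):
    """Toss a coin: 0 (head) picks a random option, 1 (tail) keeps the default."""
    coin = rng.randint(0, 1)  # 0 = head, 1 = tail
    if coin == 0:
        return rng.choice(options)
    return default

rng = random.Random(42)  # fixed seed (assumption) so the sketch is reproducible
road_options = ["good", "bad", "moderate"]

# "moderate" as the default choice is a hypothetical value for illustration
roads = [fill_feature(road_options, "moderate", rng) for _ in range(10)]
print(roads)
```

Because tails always returns the default, the default value ends up over-represented compared with a purely uniform draw, which is presumably what makes the approach "realistic".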

The dataset has 5000 rows and 11 columns of different attributes:

Locality number (the dataset is indexed on it)
Type of house (C: 5 rooms, B: 7 rooms, A: 10 rooms)
Water source (1 for available, 0 for not available)
Hypothetical pollution index (range of 1 to 10)
Year the house was built (between 1990 and 2010)
Educational facilities (1 for available, 0 for not available, within a range of 3 kilometers)
Hypothetical crime rate (range of 1 to 10)
Percentage of middle-class families (on a scale of 100)
Roads (good, bad, moderate)
Average price per house in INR (Indian Rupees) (C: 500,000, B: 1,000,000, A: 2,000,000)
Decision to buy (yes or no)
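A data set with these 11 columns could be generated roughly as shown here. The column names, the seed, and the uniform draws are assumptions for illustration; the notebook may name and sample the columns differently. Only the standard `random` module is used so the sketch runs anywhere.

```python
import random

# Average price per house type in INR, as listed above
PRICE_INR = {"C": 500_000, "B": 1_000_000, "A": 2_000_000}

def make_row(locality, rng):
    """Build one random row; all column names here are hypothetical."""
    house_type = rng.choice(["A", "B", "C"])
    return {
        "locality": locality,
        "house_type": house_type,
        "water_source": rng.randint(0, 1),         # 1 = available
        "pollution_index": rng.randint(1, 10),
        "year_built": rng.randint(1990, 2010),
        "education_nearby": rng.randint(0, 1),     # within 3 km
        "crime_rate": rng.randint(1, 10),
        "middle_class_pct": round(rng.uniform(0, 100), 1),
        "roads": rng.choice(["good", "bad", "moderate"]),
        "avg_price_inr": PRICE_INR[house_type],
        "decision_to_buy": rng.choice(["yes", "no"]),
    }

rng = random.Random(0)  # fixed seed (assumption) for reproducibility
rows = [make_row(i, rng) for i in range(1, 5001)]
print(len(rows), len(rows[0]))  # 5000 11
```

The resulting list of dictionaries can be passed to `pandas.DataFrame(rows)` to get a table indexed however the analysis requires.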

Two models were created, one for training and one for testing the data. The training model uses 80% of the 5000 rows (4000 rows), and the testing model uses the remaining 20% (1000 rows).
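A minimal sketch of the 80/20 split, assuming the rows are shuffled before being divided. The function name `split_rows` and the seed are hypothetical; scikit-learn's `train_test_split` provides the same kind of split.

```python
import random

def split_rows(rows, train_percent=80, seed=0):
    """Shuffle a copy of the rows, then cut it into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

# Row indices stand in for the 5000 rows of the data set
train, test = split_rows(range(5000))
print(len(train), len(test))  # 4000 1000
```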

A] TRAINING DATA

Let the randomly generated recommendation to buy be y’.
Pass 80% of the rows to the training model; let the recommendation to buy that it returns be y.

If y = y’ => Right_decision
If y is not equal to y’ => Wrong_decision

Efficiency/Precision = Right_decision / (80% of 5000 rows) = Right_decision / 4000
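The ratio above is the share of rows where the model's recommendation matches the generated one, which is usually called accuracy. A small sketch with hypothetical labels (the function name and the four example values are made up for illustration):

```python
def right_decision_rate(model_labels, generated_labels):
    """Fraction of rows where the model's label equals the generated label."""
    if len(model_labels) != len(generated_labels):
        raise ValueError("both label lists must have the same length")
    right = sum(m == g for m, g in zip(model_labels, generated_labels))
    return right / len(generated_labels)

y_model = ["yes", "no", "yes", "yes"]      # y: returned by the model
y_generated = ["yes", "no", "no", "yes"]   # y': randomly generated
print(right_decision_rate(y_model, y_generated))  # 0.75
```

The same function applies unchanged to the testing rows; only the number of rows in the denominator differs.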

B] TESTING DATA

Pass the remaining 20% of the rows to the test model.

Let the randomly generated recommendation to buy be y’.
Pass 20% of the rows to the test model; let the recommendation to buy that it returns be Y.

If Y = y’ => Right_decision
If Y is not equal to y’ => Wrong_decision

Efficiency/Precision = Right_decision / (20% of 5000 rows) = Right_decision / 1000

A LINEAR REGRESSION EQUATION is used to decrease the actual cost of the house by 20%, depending on the attributes.

If new reduced cost < actual cost => BUY
If new reduced cost > actual cost => do not BUY
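The README does not give the regression equation itself or say how the 20% reduction is applied, so the following is only a sketch of the comparison step under stated assumptions: an ordinary least-squares fit (via `numpy.linalg.lstsq`) on three attributes stands in for the equation, the six rows are hypothetical toy data, and the fitted value is treated as the "new cost".

```python
import numpy as np

# Hypothetical toy rows; columns: water_source, pollution_index, crime_rate
attributes = np.array([
    [1, 3, 2],
    [0, 8, 7],
    [1, 5, 5],
    [1, 2, 1],
    [0, 9, 9],
    [1, 6, 4],
], dtype=float)
actual_cost = np.array(
    [1_000_000, 500_000, 1_000_000, 2_000_000, 500_000, 1_000_000], dtype=float
)

# Add an intercept column and fit ordinary least squares
X = np.column_stack([np.ones(len(attributes)), attributes])
coef, *_ = np.linalg.lstsq(X, actual_cost, rcond=None)

# Fitted cost from the linear equation (assumption: this plays the role of the new cost)
new_cost = X @ coef

# Decision rule from the text; the equal-cost case is not defined there
# and falls into "do not BUY" in this sketch
decision = np.where(new_cost < actual_cost, "BUY", "do not BUY")
for actual, new, d in zip(actual_cost, new_cost, decision):
    print(f"actual={actual:,.0f}  new={new:,.0f}  ->  {d}")
```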

RESULT:

The model predicts the house type and the selling price for that house type based on all of the features listed above, and recommends a decision to buy, i.e., YES or NO.

Training data: 80% of the 5000 rows
Testing data: the remaining 20% of the rows

vharsoor/Data-Science

Folders and files

Latest commit

History

Repository files navigation

Data-Science Hackathon

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages