Raw dataset: Online Retail.xlsx
Purpose | Skills | Dataset | Notebook | Generated New Dataset |
---|---|---|---|---|
Data preprocessing and understanding the data | Data cleaning, feature engineering, EDA, and AI-based category generation | Online Retail.xlsx | EDA-Online-Retail.ipynb | cleaned_Online_Retail.csv |
Explore UK products | Product analysis and profitability insights | cleaned_Online_Retail.csv | products_return.ipynb | UK_return_rate.csv |
Explore customer data | Customer retention analysis, e-commerce metrics, RFM analysis, and statistical analysis | cleaned_Online_Retail.csv | Customer_Analysis.ipynb | rmf.csv, cohort_data.csv |
Explore customer behavior | Churn analysis, price analysis, and customer segmentation (with K-Means) | rmf.csv, cohort_data.csv | customer_behaviour_analysis.ipynb | customer_behaviour.csv |
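
For readers unfamiliar with the RFM analysis listed above, here is a minimal sketch of how recency, frequency, and monetary values can be derived from transaction rows. It is not the code from Customer_Analysis.ipynb: the function name `rfm_table`, the tuple layout, and the `snapshot` reference date are assumptions made for illustration, and it uses plain Python so it runs without extra dependencies.

```python
from collections import defaultdict
from datetime import datetime


def rfm_table(rows, snapshot):
    """Compute RFM values per customer.

    rows: iterable of (customer_id, invoice_no, invoice_date, quantity, unit_price)
          (an assumed layout, not necessarily the columns of the cleaned CSV)
    snapshot: reference datetime used to measure recency
    """
    last_purchase = {}            # most recent invoice date per customer
    invoices = defaultdict(set)   # distinct invoice numbers per customer
    spend = defaultdict(float)    # total revenue per customer

    for customer, invoice, when, qty, price in rows:
        if customer not in last_purchase or when > last_purchase[customer]:
            last_purchase[customer] = when
        invoices[customer].add(invoice)
        spend[customer] += qty * price

    return {
        c: {
            "recency": (snapshot - last_purchase[c]).days,  # days since last purchase
            "frequency": len(invoices[c]),                  # number of distinct invoices
            "monetary": round(spend[c], 2),                 # total amount spent
        }
        for c in last_purchase
    }
```

A table like this is the kind of input that a segmentation step such as K-Means would typically consume after scaling.
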
Note: I used both OpenAI and Groq (with the Llama3-8B-8192 model) to generate a new category column from the text data. OpenAI gives more accurate results but is billed per prompt. Groq offers a limited number of free prompts and requires payment beyond that. The code that generates the category column is included in the EDA-Online-Retail.ipynb notebook. To generate the data yourself, insert your API key and run the code.
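
Because both providers charge or rate-limit per prompt, one way to keep costs down is to send each distinct text only once and reuse the answer. The sketch below illustrates that idea and is not the notebook's actual code: `ALLOWED_CATEGORIES`, `build_prompt`, `categorize`, and the `complete` callable (a stand-in for whatever function sends a prompt to your chosen provider and returns its text reply) are all hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical label set; replace with the categories you actually want.
ALLOWED_CATEGORIES = ["Home Decor", "Kitchen", "Stationery", "Toys", "Other"]


def build_prompt(description: str) -> str:
    """Build a prompt that asks for exactly one label from the allowed list."""
    options = ", ".join(ALLOWED_CATEGORIES)
    return (
        "Assign the product to exactly one category from this list: "
        f"{options}. Reply with the category name only.\n"
        f"Product: {description}"
    )


def categorize(descriptions: Iterable[str],
               complete: Callable[[str], str]) -> List[str]:
    """Return one category per description, calling `complete` once per unique text."""
    cache: Dict[str, str] = {}
    labels: List[str] = []
    for raw in descriptions:
        key = raw.strip().upper()  # normalize so near-duplicates share one prompt
        if key not in cache:
            answer = complete(build_prompt(key)).strip()
            # Fall back to "Other" if the model replies outside the allowed list.
            cache[key] = answer if answer in ALLOWED_CATEGORIES else "Other"
        labels.append(cache[key])
    return labels
```

In practice `complete` would wrap the provider client configured with your API key. Keeping it as a parameter also lets you try the logic with a stub function before spending any prompts.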
Business insights are listed inside the ipynb files.