
Project Overview

This project analyzes sugarcane production data across different countries and continents using Python data analysis tools.

Data Loading and Cleaning
- Loaded sugarcane production dataset
- Removed unnecessary index column
- Cleaned numeric data by removing the dots used as thousands separators and converting decimal separators to standard decimal points
- Handled missing values
- Converted data types to appropriate numeric formats
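
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes pandas is the tool in use, that decimals were written with commas, and that rows with missing values were dropped. The sample text, the column list, and the helper names `to_number` and `clean` are hypothetical.

```python
import io

import pandas as pd

# Invented sample standing in for the raw file. The real column names, quoting,
# and values may differ; here dots mark thousands and commas mark decimals.
RAW_CSV = """,Country,Continent,Production (Tons),Acreage (Hectare)
0,Country A,Asia,1.250.000,"20.000,5"
1,Country B,Africa,300.500,4.200
2,Country C,Asia,,1.000
"""

# Assumed list of columns that should end up numeric.
NUMERIC_COLUMNS = ["Production (Tons)", "Acreage (Hectare)"]


def to_number(column: pd.Series) -> pd.Series:
    """Drop thousands-separator dots, turn decimal commas into points, then parse."""
    cleaned = column.str.replace(".", "", regex=False).str.replace(",", ".", regex=False)
    # Anything that still cannot be parsed becomes NaN instead of raising.
    return pd.to_numeric(cleaned, errors="coerce")


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    # pandas names a header-less first column "Unnamed: 0"; it only repeats the index.
    df = raw.drop(columns="Unnamed: 0")
    for name in NUMERIC_COLUMNS:
        df[name] = to_number(df[name])
    # Assumption: missing values are handled by dropping incomplete rows.
    return df.dropna().reset_index(drop=True)


# Read everything as text first so the separators can be fixed before parsing.
raw = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)
df = clean(raw)
print(df.dtypes)
```

If the whole file follows one number format, `pandas.read_csv` also accepts `thousands` and `decimal` arguments that can do the same conversion at load time.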

Exploratory Data Analysis
- Inspected the dataset shape (number of rows and columns)
- Analyzed continental distribution of sugarcane production
- Created visualizations for key metrics:
  - Production (Tons)
  - Production per Person (Kg)
  - Acreage (Hectare)
  - Yield (Kg/Hectare)
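
A first look at the data along these lines might resemble the sketch below. It assumes pandas and a `Continent` column. The rows are invented for illustration and are not values from the real dataset.

```python
import pandas as pd

# Invented rows for illustration only; not values from the real dataset.
df = pd.DataFrame(
    {
        "Country": ["Country A", "Country B", "Country C", "Country D"],
        "Continent": ["Asia", "Asia", "Africa", "Europe"],
        "Production (Tons)": [500.0, 300.0, 150.0, 50.0],
        "Acreage (Hectare)": [10.0, 8.0, 5.0, 2.0],
    }
)

# Shape is a (rows, columns) tuple.
rows, columns = df.shape

# Total production per continent, largest first.
production_by_continent = (
    df.groupby("Continent")["Production (Tons)"].sum().sort_values(ascending=False)
)

# Count, mean, spread, and quartiles for the numeric metrics.
summary = df[["Production (Tons)", "Acreage (Hectare)"]].describe()

print(rows, columns)
print(production_by_continent)
print(summary)
```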

Key Visualizations
- Box plots showing distribution of main metrics
- Histograms with KDE for numeric variables
- Pie chart showing top producers' percentage
- Bar plots for top producing countries
- Heat map for correlation analysis
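
As a rough sketch of two of these charts, the snippet below draws a box plot and a correlation heat map with matplotlib. The plotting library, the Pearson correlation, and the data values are assumptions, since the project does not say which were used.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Invented numeric rows for illustration only.
df = pd.DataFrame(
    {
        "Production (Tons)": [500.0, 300.0, 150.0, 50.0],
        "Acreage (Hectare)": [10.0, 8.0, 5.0, 2.0],
        "Yield (Kg/Hectare)": [50000.0, 37500.0, 30000.0, 25000.0],
    }
)

# Pairwise Pearson correlation between the numeric columns.
corr = df.corr()

fig, (ax_box, ax_heat) = plt.subplots(1, 2, figsize=(11, 4))

# Box plot: distribution of a single metric.
ax_box.boxplot(df["Yield (Kg/Hectare)"].to_numpy())
ax_box.set_title("Yield (Kg/Hectare)")

# Heat map: the correlation matrix drawn as a colored grid on a fixed -1..1 scale.
image = ax_heat.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns, rotation=45, ha="right")
ax_heat.set_yticks(range(len(corr.index)))
ax_heat.set_yticklabels(corr.index)
ax_heat.set_title("Correlation")
fig.colorbar(image, ax=ax_heat)

fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name writes the figure to disk. For histograms with a KDE overlay, seaborn's `histplot(..., kde=True)` is one option.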

Continental Distribution
- Visualized number of sugarcane-growing countries per continent

Production Analysis
- Calculated the percentage share of total production held by the top producers
- Identified leading countries in production
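
One way to compute such shares is sketched below, assuming the percentage is taken relative to the total production in the dataset. The figures are invented, and the cutoff of two countries is arbitrary.

```python
import pandas as pd

# Invented figures for illustration only.
df = pd.DataFrame(
    {
        "Country": ["Country A", "Country B", "Country C", "Country D"],
        "Production (Tons)": [500.0, 300.0, 150.0, 50.0],
    }
)

production = df.set_index("Country")["Production (Tons)"]

# Each country's production as a percentage of the overall total, largest first.
share_percent = (production / production.sum() * 100).sort_values(ascending=False)

# Keep only the leading producers; the cutoff of two is arbitrary.
top_two = share_percent.head(2)
print(top_two.round(1))
```

A series like `share_percent` can be passed to a pie or bar chart directly, since its index already carries the country labels.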

Land Usage
- Analyzed countries with highest acreage
- Compared land use vs production

Yield Analysis
- Identified countries with highest yield per hectare
- Examined relationship between land area and production
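
The two steps above could look something like this sketch. It assumes pandas, uses the Pearson coefficient as the measure of the relationship (the project does not state which measure it used), and runs on invented rows.

```python
import pandas as pd

# Invented rows for illustration only; Country D is small but high-yielding.
df = pd.DataFrame(
    {
        "Country": ["Country A", "Country B", "Country C", "Country D"],
        "Production (Tons)": [500.0, 300.0, 150.0, 50.0],
        "Acreage (Hectare)": [10.0, 8.0, 5.0, 0.5],
        "Yield (Kg/Hectare)": [50000.0, 37500.0, 30000.0, 100000.0],
    }
)

# Countries with the highest yield per hectare.
top_yield = df.nlargest(2, "Yield (Kg/Hectare)")[["Country", "Yield (Kg/Hectare)"]]

# Pearson correlation between cultivated area and production.
area_vs_production = df["Acreage (Hectare)"].corr(df["Production (Tons)"])

print(top_yield)
print(round(area_vs_production, 3))
```

A coefficient near 1 would suggest that production scales mostly with cultivated area, while the yield ranking can still put a small producer first, as the invented Country D does here.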