{
    "dataset_csv_path": "data/notebooks/csvs/flag-65.csv",
    "user_dataset_csv_path": "data/notebooks/csvs/flag-65-sysuser.csv",
    "metadata": {
        "goal": "To explore and understand the correlation between asset purchase dates, warranty periods, and employee start dates to optimize asset procurement and warranty management.",
        "role": "Procurement Analyst",
        "category": "Asset Management & User Management",
        "dataset_description": "The dataset comprises two key tables simulating ServiceNow platform: the `sys_user` table and the `alm_hardware` table with 500 entries. The assets alm_hardware table, consists of detailing key attributes of hardware assets such as asset tags, display names, configuration item (CI) identifiers, serial numbers, and model categories. It includes information on asset assignment, last update timestamps, cost, departmental allocation, and warranty expiration dates. The `sys_user` table, on the other hand, contains user-related information, focusing on the details of employees or system users. This includes fields such as 'user_id', 'name', 'role', 'email', 'start_date', and 'department', providing insights into the user profiles managed within the ServiceNow system.",
        "header": "Asset Warranty and Purchase Date Analysis (Flag 65)"
    },
    "insight_list": [
        {
            "insight": "Product Management allocates the highest average spending on assets compared to other departments.",
            "question": "How do asset costs differ between departments, and which department has the highest average spending on assets?",
            "code": "import matplotlib.pyplot as plt\nimport pandas as pd\n\n# Load the data\nflag_data = pd.read_csv(\"csvs/flag-65.csv\")\n\n# Calculate the average asset cost per department\navg_cost_per_department = (\n    flag_data.groupby(\"department\")[\"cost\"]\n    .mean()\n    .sort_values(ascending=False)\n    .reset_index()\n)\n\n# Plot the results\nplt.figure(figsize=(10, 6))\nplt.bar(\n    avg_cost_per_department[\"department\"],\n    avg_cost_per_department[\"cost\"],\n    color=\"skyblue\",\n    edgecolor=\"black\",\n)\nplt.title(\"Average Asset Cost per Department\")\nplt.xlabel(\"Department\")\nplt.ylabel(\"Average Cost of Assets\")\nplt.xticks(rotation=45, ha=\"right\")\nplt.grid(axis=\"y\", linestyle=\"--\", alpha=0.7)\n\nplt.show()"
        },
        {
            "data_type": "predictive",
            "insight": "The Linear Regression Model is able to predicts Warranty Periods Based on Purchase Dates",
            "insight_value": {
                "description": "The linear regression analysis confirms a predictable relationship between asset purchase dates and warranty periods, with a trend indicating longer warranties for more recently purchased assets."
            },
            "plot": {
                "plot_type": "regression",
                "title": "Linear Regression of Warranty Periods Against Purchase Dates",
                "x_axis": {
                    "name": "Purchase Date",
                    "value": "Date range from earliest to most recent purchases",
                    "description": "This axis represents the chronological order of asset purchases."
                },
                "y_axis": {
                    "name": "Warranty Period (years)",
                    "value": "Continuously variable warranty durations",
                    "description": "This axis plots the warranty periods, with the regression line illustrating the linear trend."
                },
                "description": "The regression plot effectively shows a clear linear trend, indicating that newer assets tend to have longer warranties. The presence of noise suggests variability around the trend line, which could be due to factors such as different asset types or supplier agreements."
            },
            "question": "Is it a linear trend and can it be regressed with noise?",
            "actionable_insight": "Given the predictability of warranty periods based on purchase dates as evidenced by the linear regression model, the organization can anticipate warranty terms for future purchases. This foresight could be instrumental in negotiating terms with suppliers or choosing products that offer the best value in terms of warranty coverage. Further, by understanding the variability (noise) around the trend, procurement managers can refine their asset management strategies to account for exceptions and ensure robust handling of warranty terms.",
            "code": "# Assuming 'df' is the DataFrame containing your data\ndf[\"warranty_expiration\"] = pd.to_datetime(df[\"warranty_expiration\"])\ndf[\"purchased_on\"] = pd.to_datetime(df[\"purchased_on\"])\n# Calculate the warranty period in years\ndf['warranty_period_years'] = (df['warranty_expiration'] - df['purchased_on']).dt.days / 365\n\n# Create the scatter plot\nplt.figure(figsize=(10, 6))\nplt.scatter(df['purchased_on'], df['warranty_period_years'], alpha=0.6, edgecolors='w', color='blue')\nplt.title('Correlation between purchased date and Warranty Period')\nplt.xlabel('Purchased On Date')\nplt.ylabel('Warranty Period (Years)')\nplt.grid(True)\n# Optionally, you can fit a linear regression line to emphasize the trend\n# Using numpy for linear regression line\nimport numpy as np\n# Convert dates to ordinal for regression\ndf['sys_updated_on_ordinal'] = df['purchased_on'].apply(lambda x: x.toordinal())\n# Fit the regression\nfit = np.polyfit(df['sys_updated_on_ordinal'], df['warranty_period_years'], 1)\nfit_fn = np.poly1d(fit)\n# Plot the regression line\nplt.plot(df['purchased_on'], fit_fn(df['sys_updated_on_ordinal']), color='red', linewidth=2)"
        },
        {
            "insight": "The Product Management and Sales departments have the highest average costs for assets nearing warranty expiration.",
            "question": "What is the average cost of assets nearing warranty expiration in the next 6 months across different departments?",
            "code": "import matplotlib.pyplot as plt\nfrom datetime import datetime, timedelta\n\n# Convert relevant date columns to datetime format\ndf[\"warranty_expiration\"] = pd.to_datetime(\n    df[\"warranty_expiration\"], errors=\"coerce\"\n)\ndf[\"purchased_on\"] = pd.to_datetime(df[\"purchased_on\"], errors=\"coerce\")\n\n# Filter data for assets with warranties expiring in the next 6 months\ncurrent_date = datetime.now()\nsix_months_later = current_date + timedelta(days=180)\nexpiring_assets = df[\n    (df[\"warranty_expiration\"] <= six_months_later)\n    & (df[\"warranty_expiration\"] >= current_date)\n]\n\n# Calculate the average cost of expiring assets per department\navg_cost_per_department = (\n    expiring_assets.groupby(\"department\")[\"cost\"].mean().reset_index()\n)\n\n# Plotting using bar plot\nplt.figure(figsize=(10, 6))\nplt.bar(\n    avg_cost_per_department[\"department\"],\n    avg_cost_per_department[\"cost\"],\n    alpha=0.7,\n    edgecolor=\"black\",\n)\nplt.title(\n    \"Average Cost of Assets with Warranty Expiring in Next 6 Months by Department\"\n)\nplt.xlabel(\"Department\")\nplt.ylabel(\"Average Cost ($)\")\nplt.xticks(rotation=45, ha=\"right\")\nplt.tight_layout()\nplt.show()"
        }
    ],
    "insights": [
        "Product Management allocates the highest average spending on assets compared to other departments.",
        "The Linear Regression Model is able to predicts Warranty Periods Based on Purchase Dates",
        "The Product Management and Sales departments have the highest average costs for assets nearing warranty expiration."
    ],
    "summary": "\n\n1. **Departmental Asset Spending**: From Question 1, it was found that Product Management allocates the highest average spending on assets, followed by Sales and Finance. This insight highlights the need for strategic budget allocation to align with departmental priorities and asset-intensive operations.\n\n2. **Predictability of Warranty Periods**: Question 2 revealed a linear trend between asset purchase dates and warranty periods, suggesting that newer assets tend to have longer warranties. This predictability can guide procurement strategies to secure favorable warranty terms, optimizing asset lifecycle management.\n\n3. **Asset Costs Nearing Warranty Expiration**: As per Question 3, Product Management and Sales departments have the highest average costs for assets nearing warranty expiration. This indicates a potential need for budget prioritization in these departments to manage upcoming asset replacements effectively."
}