{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Focused Analysis of Goal Management Categories (Flag 82)\n",
    "\n",
    "### Dataset Overview\n",
    "This dataset comprises 500 simulated records from the ServiceNow `sn_gf_goal` table, detailing various aspects related to organizational goals. Key attributes include goal status, assigned owner, department affiliation, start and end dates, and comprehensive descriptions. The dataset also features metrics like priority level, percentage completed, and target achievement percentage. It primarily focuses on tracking and managing both departmental and individual goals, providing insights into the effectiveness of these goals and their alignment with overarching organizational strategies. Additionally, the dataset logs updates for each goal, offering a historical view of changes and the identities of those making these updates.\n",
    "\n",
    "### Your Objective\n",
    "**Objective**: Investigate the unexpectedly high success rates of Low and Medium priority 'Cost Optimization' goals, and utilize these findings to enhance goal management efficiency across all goal categories.\n",
    "\n",
    "**Role**: Operational Effectiveness Analyst\n",
    "\n",
    "**Challenge Level**: 2 out of 5. This task requires adept data manipulation and interpretation skills to uncover underlying patterns and develop actionable strategies, making it a challenging yet rewarding analysis.\n",
    "\n",
    "**Category**: Goal Management"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Import Necessary Libraries\n",
    "This cell imports all necessary libraries required for the analysis. This includes libraries for data manipulation, data visualization, and any specific utilities needed for the tasks. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import argparse\n",
    "import pandas as pd\n",
    "import json\n",
    "import requests\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import seaborn as sns\n",
    "from pandas import date_range"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load Dataset\n",
    "This cell loads the dataset used for the analysis. The goal dataset is stored in a CSV file and is loaded into a DataFrame. This step includes reading the data from a file path and possibly performing initial observations such as viewing the first few rows to ensure it has loaded correctly.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>category</th>\n",
       "      <th>state</th>\n",
       "      <th>closed_at</th>\n",
       "      <th>opened_at</th>\n",
       "      <th>closed_by</th>\n",
       "      <th>number</th>\n",
       "      <th>sys_updated_by</th>\n",
       "      <th>location</th>\n",
       "      <th>assigned_to</th>\n",
       "      <th>caller_id</th>\n",
       "      <th>sys_updated_on</th>\n",
       "      <th>short_description</th>\n",
       "      <th>priority</th>\n",
       "      <th>assignement_group</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Database</td>\n",
       "      <td>Closed</td>\n",
       "      <td>2023-07-25 03:32:18.462401146</td>\n",
       "      <td>2023-01-02 11:04:00</td>\n",
       "      <td>Fred Luddy</td>\n",
       "      <td>INC0000000034</td>\n",
       "      <td>admin</td>\n",
       "      <td>Australia</td>\n",
       "      <td>Fred Luddy</td>\n",
       "      <td>ITIL User</td>\n",
       "      <td>2023-07-06 03:31:13.838619495</td>\n",
       "      <td>There was an issue</td>\n",
       "      <td>2 - High</td>\n",
       "      <td>Database</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Hardware</td>\n",
       "      <td>Closed</td>\n",
       "      <td>2023-03-11 13:42:59.511508874</td>\n",
       "      <td>2023-01-03 10:19:00</td>\n",
       "      <td>Charlie Whitherspoon</td>\n",
       "      <td>INC0000000025</td>\n",
       "      <td>admin</td>\n",
       "      <td>India</td>\n",
       "      <td>Beth Anglin</td>\n",
       "      <td>Don Goodliffe</td>\n",
       "      <td>2023-05-19 04:22:50.443252112</td>\n",
       "      <td>There was an issue</td>\n",
       "      <td>1 - Critical</td>\n",
       "      <td>Hardware</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Database</td>\n",
       "      <td>Resolved</td>\n",
       "      <td>2023-01-20 14:37:18.361510788</td>\n",
       "      <td>2023-01-04 06:37:00</td>\n",
       "      <td>Charlie Whitherspoon</td>\n",
       "      <td>INC0000000354</td>\n",
       "      <td>system</td>\n",
       "      <td>India</td>\n",
       "      <td>Fred Luddy</td>\n",
       "      <td>ITIL User</td>\n",
       "      <td>2023-02-13 08:10:20.378839709</td>\n",
       "      <td>There was an issue</td>\n",
       "      <td>2 - High</td>\n",
       "      <td>Database</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hardware</td>\n",
       "      <td>Resolved</td>\n",
       "      <td>2023-01-25 20:46:13.679914432</td>\n",
       "      <td>2023-01-04 06:53:00</td>\n",
       "      <td>Fred Luddy</td>\n",
       "      <td>INC0000000023</td>\n",
       "      <td>admin</td>\n",
       "      <td>Canada</td>\n",
       "      <td>Luke Wilson</td>\n",
       "      <td>Don Goodliffe</td>\n",
       "      <td>2023-06-14 11:45:24.784548040</td>\n",
       "      <td>There was an issue</td>\n",
       "      <td>2 - High</td>\n",
       "      <td>Hardware</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Hardware</td>\n",
       "      <td>Closed</td>\n",
       "      <td>2023-05-10 22:35:58.881919516</td>\n",
       "      <td>2023-01-05 16:52:00</td>\n",
       "      <td>Luke Wilson</td>\n",
       "      <td>INC0000000459</td>\n",
       "      <td>employee</td>\n",
       "      <td>UK</td>\n",
       "      <td>Charlie Whitherspoon</td>\n",
       "      <td>David Loo</td>\n",
       "      <td>2023-06-11 20:25:35.094482408</td>\n",
       "      <td>There was an issue</td>\n",
       "      <td>2 - High</td>\n",
       "      <td>Hardware</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   category     state                      closed_at            opened_at  \\\n",
       "0  Database    Closed  2023-07-25 03:32:18.462401146  2023-01-02 11:04:00   \n",
       "1  Hardware    Closed  2023-03-11 13:42:59.511508874  2023-01-03 10:19:00   \n",
       "2  Database  Resolved  2023-01-20 14:37:18.361510788  2023-01-04 06:37:00   \n",
       "3  Hardware  Resolved  2023-01-25 20:46:13.679914432  2023-01-04 06:53:00   \n",
       "4  Hardware    Closed  2023-05-10 22:35:58.881919516  2023-01-05 16:52:00   \n",
       "\n",
       "              closed_by         number sys_updated_by   location  \\\n",
       "0            Fred Luddy  INC0000000034          admin  Australia   \n",
       "1  Charlie Whitherspoon  INC0000000025          admin      India   \n",
       "2  Charlie Whitherspoon  INC0000000354         system      India   \n",
       "3            Fred Luddy  INC0000000023          admin     Canada   \n",
       "4           Luke Wilson  INC0000000459       employee         UK   \n",
       "\n",
       "            assigned_to      caller_id                 sys_updated_on  \\\n",
       "0            Fred Luddy      ITIL User  2023-07-06 03:31:13.838619495   \n",
       "1           Beth Anglin  Don Goodliffe  2023-05-19 04:22:50.443252112   \n",
       "2            Fred Luddy      ITIL User  2023-02-13 08:10:20.378839709   \n",
       "3           Luke Wilson  Don Goodliffe  2023-06-14 11:45:24.784548040   \n",
       "4  Charlie Whitherspoon      David Loo  2023-06-11 20:25:35.094482408   \n",
       "\n",
       "    short_description      priority assignement_group  \n",
       "0  There was an issue      2 - High          Database  \n",
       "1  There was an issue  1 - Critical          Hardware  \n",
       "2  There was an issue      2 - High          Database  \n",
       "3  There was an issue      2 - High          Hardware  \n",
       "4  There was an issue      2 - High          Hardware  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dataset_path = \"csvs/flag-82.csv\"\n",
    "goal_data = pd.read_csv(dataset_path)\n",
    "df = pd.read_csv(dataset_path)\n",
    "goal_data.head()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Question 1: How does the success rate of goals met across different categories compare?**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot percentage of target goals achieved by category\n",
    "\n",
    "This plot visualizes the percentage of target goals achieved across different categories or topics, providing  insight into the success rate of goal management. This helps in identifying which kind of goals are succeeding at meeting and which areas or categories improvements might be necessary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "N/A\n"
     ]
    }
   ],
   "source": [
    "# import matplotlib.pyplot as plt\n",
    "# import seaborn as sns\n",
    "# import pandas as pd\n",
    "# import numpy as np\n",
    "\n",
    "# # Assuming 'goal_data' is the DataFrame created from the previous code\n",
    "\n",
    "# # Calculate if each goal met its target percentage\n",
    "# goal_data['goal_met'] = goal_data.apply(lambda row: row['percent_complete'] >= row['target_percentage'], axis=1)\n",
    "\n",
    "# # Group by department and calculate the percentage of goals met\n",
    "# department_goal_achievement = goal_data.groupby('category')['goal_met'].mean() * 100\n",
    "\n",
    "# # Reset index to turn the series into a DataFrame\n",
    "# department_goal_achievement = department_goal_achievement.reset_index()\n",
    "\n",
    "# # Rename columns for better readability in the plot\n",
    "# department_goal_achievement.columns = ['Category', 'Percentage of Goals Met']\n",
    "\n",
    "# # Create a bar plot\n",
    "# plt.figure(figsize=(10, 6))\n",
    "# bar_plot = sns.barplot(x='Category', y='Percentage of Goals Met', data=department_goal_achievement, palette='viridis')\n",
    "# plt.title('Percentage of Target Goals Achieved in a Category')\n",
    "# plt.xlabel('Category')\n",
    "# plt.ylabel('Percentage of Goals Met')\n",
    "# plt.ylim(0, 100)  # Set y-axis limits to make differences more evident\n",
    "# for p in bar_plot.patches:\n",
    "#     bar_plot.annotate(format(p.get_height(), '.0f'), \n",
    "#                       (p.get_x() + p.get_width() / 2., p.get_height()), \n",
    "#                       ha = 'center', va = 'center', \n",
    "#                       xytext = (0, 9), \n",
    "#                       textcoords = 'offset points')\n",
    "# plt.show()\n",
    "\n",
    "print(\"N/A\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Generate JSON Description for the Insight"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'data_type': 'comparative',\n",
       " 'insight': 'There was no column percent_complete to conduct any analysis',\n",
       " 'insight_value': {},\n",
       " 'plot': {'description': 'The graph could not be generated due to missing data'},\n",
       " 'question': 'How does the success rate of goals met across different categories compare?',\n",
       " 'actionable_insight': 'No actionable insight could be generated due to missing data'}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{\n",
    "\t\"data_type\": \"comparative\",\n",
    "\t\"insight\": \"There was no column percent_complete to conduct any analysis\",\n",
    "\t\"insight_value\": {\n",
    "\t},\n",
    "\t\"plot\": {\n",
    "    \t\"description\": \"The graph could not be generated due to missing data\",\n",
    "\t},\n",
    "\t\"question\": \"How does the success rate of goals met across different categories compare?\",\n",
    "\t\"actionable_insight\": \"No actionable insight could be generated due to missing data\"\n",
    "}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Question 2:** How do cross-departmental tasks compare to non-cross-departmental tasks in terms of completion and target achievement percentages?\n",
    "\n",
    "This plot illustrates the average completion and target achievement percentages for tasks classified as cross-departmental versus non-cross-departmental. By comparing these two categories, we can assess the impact of cross-departmental collaboration on task performance and goal attainment. The plot shows that cross-departmental tasks tend to have higher percentages in both completion and target achievement, suggesting the benefits of collaborative efforts across departments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "N/A\n"
     ]
    }
   ],
   "source": [
    "# # Define a list of keywords that might suggest cross-departmental goals\n",
    "# cross_dept_keywords = [\"collaborate\", \"joint\", \"integration\", \"cross-departmental\", \"partnership\"]\n",
    "\n",
    "# # Function to check if a description suggests cross-departmental goals\n",
    "# def is_cross_departmental(description):\n",
    "#     return any(keyword in description.lower() for keyword in cross_dept_keywords)\n",
    "\n",
    "# # Apply the function to create a new column indicating cross-departmental goals\n",
    "# df['is_cross_departmental'] = df['description'].apply(is_cross_departmental)\n",
    "\n",
    "# # Calculate the average percent_complete and target_percentage for cross-departmental and non-cross-departmental tasks\n",
    "# avg_data = df.groupby('is_cross_departmental').agg({\n",
    "#     'percent_complete': 'mean',\n",
    "#     'target_percentage': 'mean'\n",
    "# }).reset_index()\n",
    "\n",
    "# # Rename the values for clarity\n",
    "# avg_data['is_cross_departmental'] = avg_data['is_cross_departmental'].map({True: 'Cross-Departmental', False: 'Non-Cross-Departmental'})\n",
    "\n",
    "# # Plot the average percent_complete and target_percentage in a single bar plot\n",
    "# plt.figure(figsize=(14, 7))\n",
    "# barplot = sns.barplot(x='is_cross_departmental', y='value', hue='variable', \n",
    "#                       data=pd.melt(avg_data, id_vars='is_cross_departmental', value_vars=['percent_complete', 'target_percentage']),\n",
    "#                       palette='coolwarm')\n",
    "\n",
    "# # Annotate the bars with the actual values\n",
    "# for p in barplot.patches:\n",
    "#     barplot.annotate(f'{p.get_height():.2f}%', \n",
    "#                      (p.get_x() + p.get_width() / 2., p.get_height()), \n",
    "#                      ha='center', va='center', \n",
    "#                      xytext=(0, 10), \n",
    "#                      textcoords='offset points',\n",
    "#                      fontweight='bold')\n",
    "\n",
    "# plt.title('Average Completion and Target Percentage: Cross-Departmental vs Non-Cross-Departmental Tasks')\n",
    "# plt.xlabel('Task Type')\n",
    "# plt.ylabel('Percentage')\n",
    "# plt.ylim(0, 100)\n",
    "# plt.legend(title='Metric', loc='upper left')\n",
    "# plt.grid(True, axis='y', linestyle='--', alpha=0.7)\n",
    "# plt.show()\n",
    "\n",
    "print(\"N/A\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'data_type': 'Diagnostic',\n",
       " 'insight': 'There was no column description to conduct any analysis',\n",
       " 'insight_value': {},\n",
       " 'plot': {'description': 'The graph could not be generated due to missing data'},\n",
       " 'question': 'How do cross-departmental tasks perform in terms of completion and target achievement compared to non-cross-departmental tasks?',\n",
       " 'actionable_insight': 'No actionable insight could be generated due to missing data'}"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{\n",
    "\t\"data_type\": \"Diagnostic\",\n",
    "\t\"insight\": \"There was no column description to conduct any analysis\",\n",
    "\t\"insight_value\": {\n",
    "\t},\n",
    "\t\"plot\": {\n",
    "    \t\"description\": \"The graph could not be generated due to missing data\",\n",
    "\t},\n",
    "\t\"question\": \"How do cross-departmental tasks perform in terms of completion and target achievement compared to non-cross-departmental tasks?\",\n",
    "\t\"actionable_insight\": \"No actionable insight could be generated due to missing data\"\n",
    "}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Question 3:** How are 'Cost Reduction' goals distributed by priority compared to goals?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot proportion of successful goals by priority in Cost Reduction category\n",
    "\n",
    "This bar plot depicts the success rates of goals within the Cost Reduction category, categorized by their priority levels: Critical, High, Medium, and Low. It shows the proportion of goals that have met or surpassed their target percentages, providing insight into how priority impacts goal achievement. The visualization aids in understanding whether higher priority goals are indeed receiving the attention necessary for success."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "N/A\n"
     ]
    }
   ],
   "source": [
    "# import matplotlib.pyplot as plt\n",
    "# import seaborn as sns\n",
    "\n",
    "# # Filter the data for the IT department\n",
    "# it_goals = goal_data[goal_data['category'] == 'Cost Reduction']\n",
    "\n",
    "# # Define successful goals (assuming successful means percent_complete >= target_percentage)\n",
    "# it_goals['is_successful'] = it_goals['percent_complete'] >= it_goals['target_percentage']\n",
    "\n",
    "# # Calculate the proportion of successful goals by priority\n",
    "# success_rates = it_goals.groupby('priority')['is_successful'].mean()\n",
    "\n",
    "# # Convert the series to a DataFrame for plotting\n",
    "# success_rates_df = success_rates.reset_index()\n",
    "\n",
    "# # Plotting\n",
    "# plt.figure(figsize=(10, 6))\n",
    "# bar_plot = sns.barplot(x='priority', y='is_successful', data=success_rates_df, order=['Critical', 'High', 'Medium', 'Low'])\n",
    "# plt.title('Proportion of Successful Goals by Priority in Cost reduction Category')\n",
    "# plt.xlabel('Priority')\n",
    "# plt.ylabel('Proportion of Successful Goals')\n",
    "# plt.ylim(0, 1)  # Set the limit to show proportions from 0 to 1\n",
    "# for p in bar_plot.patches:\n",
    "#     bar_plot.annotate(format(p.get_height(), '.1%'),  # Format as a percentage with one decimal\n",
    "#                       (p.get_x() + p.get_width() / 2., p.get_height()),\n",
    "#                       ha='center', va='center', \n",
    "#                       xytext=(0, 9), \n",
    "#                       textcoords='offset points')\n",
    "# plt.show()\n",
    "\n",
    "print(\"N/A\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Generate JSON Description for the Insight"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'data_type': 'descriptive',\n",
       " 'insight': 'There was no column percent_complete to conduct any analysis',\n",
       " 'insight_value': {},\n",
       " 'plot': {'description': 'The graph could not be generated due to missing data'},\n",
       " 'question': \"How are 'Cost Reduction' goals distributed by priority compared to goals in other categories?\",\n",
       " 'actionable_insight': 'No actionable insight could be generated due to missing data'}"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{\n",
    "\t\"data_type\": \"descriptive\",\n",
    "\t\"insight\": \"There was no column percent_complete to conduct any analysis\",\n",
    "\t\"insight_value\": {\n",
    "\t},\n",
    "\t\"plot\": {\n",
    "    \t\"description\": \"The graph could not be generated due to missing data\",\n",
    "\t},\n",
    "\t\"question\": \"How are 'Cost Reduction' goals distributed by priority compared to goals in other categories?\",\n",
    "\t\"actionable_insight\": \"No actionable insight could be generated due to missing data\"\n",
    "}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Question 4:** Is this unusual trend of low and medium priority goals seen in the Cost Reduction category also observed across other categories??"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot proportion of successful goals by priority across categories\n",
    "\n",
    "This bar plot provides a comparative analysis of the success rates of goals by priority levels (Critical, High, Medium, Low) across different category of goals. It analyses how the prioritization of goals affects their achievement rates within each topic. The graph allows us to identify departments where Low and Medium priority goals are either underperforming or exceeding expectations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "N/A\n"
     ]
    }
   ],
   "source": [
    "# import matplotlib.pyplot as plt\n",
    "# import seaborn as sns\n",
    "\n",
    "# # Define successful goals (assuming successful means percent_complete >= target_percentage)\n",
    "# goal_data['is_successful'] = goal_data['percent_complete'] >= goal_data['target_percentage']\n",
    "\n",
    "# # Calculate the proportion of successful goals by priority and department\n",
    "# success_rates = goal_data.groupby(['category', 'priority'])['is_successful'].mean().reset_index()\n",
    "\n",
    "# # Plotting\n",
    "# plt.figure(figsize=(14, 8))\n",
    "# barplot = sns.barplot(x='category', y='is_successful', hue='priority', data=success_rates, hue_order=['Critical', 'High', 'Medium', 'Low'])\n",
    "\n",
    "# # Annotate each bar\n",
    "# for p in barplot.patches:\n",
    "#     barplot.annotate(format(p.get_height(), '.2f'),  # format as a percentage\n",
    "#                      (p.get_x() + p.get_width() / 2., p.get_height()),\n",
    "#                      ha = 'center', va = 'center',\n",
    "#                      size=9,\n",
    "#                      xytext = (0, 5),\n",
    "#                      textcoords = 'offset points')\n",
    "\n",
    "# plt.title('Proportion of Successful Goals by Priority Across categoriess')\n",
    "# plt.xlabel('Category')\n",
    "# plt.ylabel('Proportion of Successful Goals')\n",
    "# plt.ylim(0, 1)  # Set the limit to show proportions from 0 to 1\n",
    "# plt.legend(title='Priority')\n",
    "# plt.show()\n",
    "\n",
    "print(\"N/A\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Generate JSON Description for the Insight"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'data_type': 'diagnostic',\n",
       " 'insight': 'There was no column percent_complete to conduct any analysis',\n",
       " 'insight_value': {},\n",
       " 'plot': {'description': 'The graph could not be generated due to missing data'},\n",
       " 'question': 'Is this unusual trend of low and medium priority goals seen in the Cost Reduction category also observed across other categories?',\n",
       " 'actionable_insight': 'No actionable insight could be generated due to missing data'}"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{\n",
    "\t\"data_type\": \"diagnostic\",\n",
    "\t\"insight\": \"There was no column percent_complete to conduct any analysis\",\n",
    "\t\"insight_value\": {\n",
    "\t},\n",
    "\t\"plot\": {\n",
    "    \t\"description\": \"The graph could not be generated due to missing data\",\n",
    "\t},\n",
    "\t\"question\": \"Is this unusual trend of low and medium priority goals seen in the Cost Reduction category also observed across other categories?\",\n",
    "\t\"actionable_insight\": \"No actionable insight could be generated due to missing data\"\n",
    "}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Question 5:** What is the distribution of Low and Medium priority goals in Cost Reduction versus other categories?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot distribution of Low and Medium priority goals in Cost Reduction vs other categories\n",
    "\n",
    "This bar graph illustrates the distribution of goals classified as Low or Medium priority within the Cost Reduction categories compared to other categories. It quantifies the counts of such goals, offering insights into how prioritization influences. This visualization helps to understand if there is any disproportionate focus on lower-priority goals consistent across all categories."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "N/A\n"
     ]
    }
   ],
   "source": [
    "# import matplotlib.pyplot as plt\n",
    "# import seaborn as sns\n",
    "# import pandas as pd\n",
    "\n",
    "# # Assume 'goal_data' is your DataFrame and already loaded\n",
    "\n",
    "# # Filter the data to include only Critical and High priority goals\n",
    "# filtered_goals = goal_data[goal_data['priority'].isin(['Low', 'Medium'])]\n",
    "\n",
    "# # Create a new column 'IT_or_Other' to distinguish between IT and other departments\n",
    "# filtered_goals['CR_or_Other'] = filtered_goals['category'].apply(lambda x: 'Cost Reduction' if x == 'Cost Reduction' else 'Other')\n",
    "\n",
    "# # Count the number of goals in each category\n",
    "# priority_counts = filtered_goals.groupby(['CR_or_Other', 'priority']).size().reset_index(name='counts')\n",
    "\n",
    "# # Plotting\n",
    "# plt.figure(figsize=(10, 6))\n",
    "# bar_plot = sns.barplot(x='CR_or_Other', y='counts', hue='priority', data=priority_counts)\n",
    "# plt.title('Distribution of Low and Medium Priority Goals: Cost Reduction vs. Other Categories')\n",
    "# plt.xlabel('Category')\n",
    "# plt.ylabel('Number of Goals')\n",
    "# plt.legend(title='Priority')\n",
    "\n",
    "# # Annotate bars with the count of goals\n",
    "# for p in bar_plot.patches:\n",
    "#     bar_plot.annotate(format(p.get_height(), '.0f'), \n",
    "#                       (p.get_x() + p.get_width() / 2., p.get_height()), \n",
    "#                       ha='center', va='center', \n",
    "#                       xytext=(0, 9), \n",
    "#                       textcoords='offset points')\n",
    "\n",
    "# plt.show()\n",
    "\n",
    "print(\"N/A\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Generate JSON Description for the Insight"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'data_type': 'diagnostic',\n",
       " 'insight': \"Filtering operation resulted in an empty DataFrame, as no records matched the specified priority levels ('Low' or 'Medium') and data was available to conduct any analysis\",\n",
       " 'insight_value': {},\n",
       " 'plot': {'description': 'The graph could not be generated due to missing data'},\n",
       " 'question': 'What is the distribution of Low and Medium priority goals in Cost Reduction versus other categories?',\n",
       " 'actionable_insight': 'No actionable insight could be generated due to missing data'}"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{\n",
    "\t\"data_type\": \"diagnostic\",\n",
    "\t\"insight\": \"No data is available to conduct any analysis\",\n",
    "\t\"insight_value\": {\n",
    "\t},\n",
    "\t\"plot\": {\n",
    "    \t\"description\": \"The graph could not be generated due to missing data\",\n",
    "\t},\n",
    "\t\"question\": \"What is the distribution of Low and Medium priority goals in Cost Reduction versus other categories?\",\n",
    "\t\"actionable_insight\": \"No actionable insight could be generated due to missing data\"\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Summary of Findings (Flag 82)\n",
    "\n",
    "1. **Comparison of Success Rates Across Categories**: The lack of a `percent_complete` column prevents the analysis of success rates for goals met across different categories, limiting insights into the effectiveness of goal achievement across the organization.\n",
    "\n",
    "2. **Performance of Cross-Departmental Tasks**: Without the `description` column, it is impossible to compare the performance of cross-departmental tasks in terms of completion and target achievement against non-cross-departmental tasks, obscuring potential collaboration impacts.\n",
    "\n",
    "3. **Distribution of 'Cost Reduction' Goals by Priority**: The absence of the `percent_complete` column restricts the ability to analyze how 'Cost Reduction' goals are distributed by priority compared to goals in other categories, limiting understanding of focus areas and potential resource allocation.\n",
    "\n",
    "4. **Trends in Priority Goals Across Categories**: The lack of a `percent_complete` column inhibits the examination of whether the observed trend of low and medium priority goals in the Cost Reduction category is reflected in other categories, hampering the identification of systemic issues."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
