{
    "dataset_csv_path": "data/notebooks/csvs/flag-14.csv",
    "user_dataset_csv_path": "data/notebooks/csvs/flag-14-sysuser.csv",
    "metadata": {
        "goal": "Given the user and incidents tabular data, analyze discrepancies in the performance of human agents, particularly focusing on a certain human agent, to understand the underlying causes of performance dips and devise strategies to manage workload during absence periods.",
        "role": "Team Lead",
        "category": "User Management",
        "dataset_description": "The dataset comprises two key tables simulating ServiceNow platform: the `sys_user` table and the `incidents` table, each with 500 entries. The `incidents` table focuses on detailing various attributes such as category, state, open and close dates, involved personnel, and incident specifics like location, description, and priority. It effectively captures incident management activities with fields including 'opened_at', 'closed_at', 'assigned_to', 'short_description', and 'priority', reflecting the operational handling and urgency of issues across different locations and categories. The `sys_user` table, on the other hand, contains user-related information, focusing on the details of employees or system users. This includes fields such as 'user_id', 'name', 'role', 'email', 'schedule', and 'department', providing insights into the user profiles managed within the ServiceNow system. Together, these tables are instrumental in understanding the workflow, user involvement and detailed tracking of incident resolution processes within an organization.",
        "header": "Performance Trends in Employee Agents Management (Flag 14)"
    },
    "insight_list": [
        {
            "data_type": "descriptive",
            "insight": "Fred Luddy has a significantly higher average TTR compared to other agents",
            "insight_value": {
                "x_val": "Fred Luddy",
                "y_val": 17.09
            },
            "plot": {
                "plot_type": "histogram",
                "title": "Average Time to Resolution (TTR) by Agent",
                "x_axis": {
                    "name": "Agent",
                    "value": [
                        "Beth Anglin",
                        "Charlie Whitherspoon",
                        "Fred Luddy",
                        "Howard Johnson",
                        "Luke Wilson"
                    ],
                    "description": "This represents the different agents assigned to resolve incidents."
                },
                "y_axis": {
                    "name": "Average Resolution Time (days)",
                    "value": [
                        4.26,
                        5.29,
                        17.09,
                        4.82,
                        4.98
                    ],
                    "description": "This represents the average time each agent takes to resolve incidents, measured in days."
                },
                "description": "The histogram displays the average resolution time for each agent. Each bar represents an agent and the height of the bar corresponds to the average time taken to resolve incidents. The values are annotated on each bar. Fred Luddy's bar is noticeably higher, indicating a longer average resolution time compared to his peers."
            },
            "question": "What is the average TTR of each agent as a histogram?",
            "actionable_insight": "Given that Fred Luddy's average TTR is significantly higher than his peers, it may be beneficial to investigate the specific reasons behind this anomaly. Possible actions include reviewing the complexity of incidents assigned to him, checking for any personal or systemic issues during his shifts, or providing additional support or training to help him manage his workload more efficiently.",
            "code": "import pandas as pd\nimport matplotlib.pyplot as plt\n\n# Assuming dataset_path is defined and points to the correct CSV file\ndf = pd.read_csv(dataset_path)\n\n# Convert opened_at and closed_at to datetime\ndf[\"opened_at\"] = pd.to_datetime(df[\"opened_at\"])\ndf[\"closed_at\"] = pd.to_datetime(df[\"closed_at\"])\n\n# Compute TTR in days\ndf[\"resolution_time\"] = (df[\"closed_at\"] - df[\"opened_at\"]).dt.total_seconds() / 86400\n\n# Group by 'assigned_to' and compute the average resolution time for each agent\navg_ttr_by_agent = df.groupby(\"assigned_to\")[\"resolution_time\"].mean()\n\n# Plotting the average TTR of each agent as a histogram\nax = avg_ttr_by_agent.plot(kind='bar', figsize=(10, 6), color='skyblue')\n\nplt.title(\"Average Resolution Time (TTR) by Agent\")\nplt.xlabel(\"Agent\")\nplt.ylabel(\"Average Resolution Time (days)\")\nplt.xticks(rotation=45)\n\n# Annotate each bar with its value\nfor p in ax.patches:\n    ax.annotate(format(p.get_height(), '.2f'), \n                (p.get_x() + p.get_width() / 2., p.get_height()), \n                ha = 'center', va = 'center', \n                xytext = (0, 9), \n                textcoords = 'offset points')\n\nplt.show()"
        },
        {
            "data_type": "descriptive",
            "insight": "Fred Luddy's TTR begins to increase linearly over time from a specific point of time (01-06-2023) compared to other agents who maintain a uniform TTR",
            "insight_value": {
                "x_val": "Time Period",
                "y_val": "Increasing TTR Trend for Fred Luddy from June 2023 onwards"
            },
            "plot": {
                "plot_type": "line",
                "title": "Time to Resolution (TTR) Trend Comparison Among Agents",
                "x_axis": {
                    "name": "Month-Year",
                    "value": [
                        "Jan-2023",
                        "Feb-2023",
                        "Mar-2023",
                        "Apr-2023",
                        "May-2023",
                        "..."
                    ],
                    "description": "This represents the timeline over which the TTR data is analyzed."
                },
                "y_axis": {
                    "name": "Average Resolution Time (days)",
                    "value": "line plot",
                    "description": "This represents the average time taken to resolve incidents, measured in days, across different agents."
                },
                "description": "The line plot shows the TTR trends for each agent over several months. While other agents' TTR remains relatively stable, Fred Luddy's TTR starts to increase linearly from a specific point in time. This divergence is clearly visible and raises concerns about factors influencing his performance."
            },
            "question": "How does the TTR of the specific agent compare to other agents during the same time frame?",
            "actionable_insight": "The observed linear increase in TTR for Fred Luddy suggests a potential issue that may be impacting his efficiency. It is advisable to investigate further into Fred Luddy's availability and workload, the complexity of the cases assigned, or any personal or systemic changes that occurred at the point when his TTR began to rise.",
            "code": "import pandas as pd\nimport matplotlib.pyplot as plt\n\n\n\n# Convert opened_at and closed_at to datetime\ndf[\"opened_at\"] = pd.to_datetime(df[\"opened_at\"])\ndf[\"closed_at\"] = pd.to_datetime(df[\"closed_at\"])\n\n# Compute resolution time in days\ndf[\"resolution_time\"] = (df[\"closed_at\"] - df[\"opened_at\"]).dt.total_seconds() / 86400\n\n# Extract month-year from opened_at and create a new column\ndf[\"month_year\"] = df[\"opened_at\"].dt.to_period(\"M\")\n\n# Group by month_year and category, then compute average resolution time\ndf_grouped = (\n    df.groupby([\"month_year\", \"assigned_to\"])[\"resolution_time\"].mean().unstack()\n)\n\n# Plot the data\ndf_grouped.plot(kind=\"line\", figsize=(12, 6))\nplt.title(\"Average Resolution Time by agent Over Time\")\nplt.xlabel(\"Month-Year\")\nplt.ylabel(\" Resolution Time (days) over time\")\nplt.xticks(rotation=45)\nplt.legend(title=\"Assigned_to\")\nplt.show()"
        },
        {
            "data_type": "descriptive",
            "insight": "The number of incidents assigned to each agent, including Fred Luddy, remains uniform over time",
            "insight_value": {
                "x_val": "Agents",
                "y_val": "Uniform Distribution of Incident Assignment"
            },
            "plot": {
                "plot_type": "histogram",
                "title": "Distribution of Incident Assignments Among Agents Over Time",
                "x_axis": {
                    "name": "Agent",
                    "value": [
                        "Beth Anglin",
                        "Charlie Whitherspoon",
                        "Fred Luddy",
                        "Howard Johnson",
                        "Luke Wilson"
                    ],
                    "description": "This represents the different agents handling incidents."
                },
                "y_axis": {
                    "name": "Number of Incidents",
                    "value": [
                        108,
                        91,
                        85,
                        102,
                        101
                    ],
                    "description": "This represents the count of incidents assigned to each agent over the analyzed time period."
                },
                "description": "The histogram displays the number of incidents assigned to each agent over a specific time period. The distribution is relatively uniform across all agents, indicating that workload distribution in terms of number of incidents is even. This suggests that the increasing TTR for Fred Luddy is not due to an excessive number of assignments."
            },
            "question": "What is the pattern in the number of incidents assigned to the specific agent over time?",
            "actionable_insight": "Since the distribution of incident assignments is uniform among all agents, the prolonged TTR for Fred Luddy is unlikely to be caused by an overload of assignments. It may be beneficial to explore other factors such as the complexity of the incidents assigned to Fred, his working methods, or potential personal or systemic issues that might be affecting his performance. Monitoring the trend in the number of open tickets for each agent over time could also provide additional insights into workload management and efficiency.",
            "code": "agent_incident_count = df.groupby('assigned_to')['number'].count()\n\n# Plot the histogram\nax = agent_incident_count.plot(kind='bar', figsize=(10,6))\n\nfor p in ax.patches:\n    ax.annotate(format(p.get_height(), '.2f'), \n                (p.get_x() + p.get_width() / 2., p.get_height()), \n                ha = 'center', va = 'center', \n                xytext = (0, 9), \n                textcoords = 'offset points')\nplt.title('Number of Incidents Assigned Per Agent')\nplt.xlabel('Agent')\nplt.ylabel('Number of Incidents Assigned')\nplt.grid(True)\nplt.xticks(rotation=45)\nplt.show()"
        },
        {
            "data_type": "diagnostic",
            "insight": "The number of open incidents for Fred Luddy is increasing over time, coinciding with the period where his TTR began to increase linearly",
            "insight_value": {
                "x_val": "Time Period",
                "y_val": "Increasing Number of Open Incidents from june 2023 onwards for Fred Luddy"
            },
            "plot": {
                "plot_type": "line",
                "title": "Trend of Open Incidents for Fred Luddy Over Time",
                "x_axis": {
                    "name": "Month-Year",
                    "value": [
                        "Jan-2023",
                        "Feb-2023",
                        "Mar-2023",
                        "Apr-2023",
                        "May-2023"
                    ],
                    "description": "This represents the timeline over which the open incident data is analyzed."
                },
                "y_axis": {
                    "name": "Number of Open Incidents",
                    "description": "This represents the count of incidents still open and unresolved, assigned to Fred Luddy over the analyzed time period."
                },
                "description": "The line plot illustrates a clear increasing trend in the number of open incidents assigned to Fred Luddy, starting from a specific point in time. This increase aligns with the time when his TTR also begins to rise, suggesting a potential correlation between the growing backlog of open incidents and his prolonged resolution times."
            },
            "question": "What is the pattern in the number of open incidents assigned to the specific agent over time?",
            "actionable_insight": "The increasing trend in open incidents assigned to Fred Luddy warrants further investigation, particularly in relation to his leave periods. It is crucial to assess whether these open incidents are becoming more complex or if there are other factors at play during his leave that impact his ability to close cases efficiently. Addressing this increasing backlog by redistributing workload during peak times or providing additional support during his leave could help in managing the resolution times more effectively.",
            "code": "df['opened_at'] = pd.to_datetime(df['opened_at'])\ndf['closed_at'] = pd.to_datetime(df['closed_at'])\n# Define the current date for the analysis, simulate up to the last 'opened_at' date\ncurrent_date = df['opened_at'].max()\n\n# Create a range of dates from the start to the current date\ndate_range = pd.date_range(start=df['opened_at'].min(), end=current_date, freq='D')\n\n# Function to count open incidents per date\ndef count_open_incidents(date, agent_data):\n    # Incidents that are opened on or before 'date' and are not closed or closed after 'date'\n    open_incidents = agent_data[(agent_data['opened_at'] <= date) & ((agent_data['closed_at'].isna()) | (agent_data['closed_at'] > date))]\n    return len(open_incidents)\n\n# Initialize a DataFrame to store the results\nopen_incidents_data = pd.DataFrame()\n\n# Loop through each agent to calculate their open incidents over time\nfor agent in df['assigned_to'].unique():\n    agent_data = df[df['assigned_to'] == agent]\n    open_counts = [count_open_incidents(date, agent_data) for date in date_range]\n    temp_df = pd.DataFrame({\n        'Date': date_range,\n        'Open Incidents': open_counts,\n        'Agent': agent\n    })\n    open_incidents_data = pd.concat([open_incidents_data, temp_df], ignore_index=True)\n\n# Plotting the data\nplt.figure(figsize=(14, 7))\nsns.lineplot(data=open_incidents_data, x='Date', y='Open Incidents', hue='Agent', marker='o')\nplt.title('Number of Open Incidents Over Time for Each Agent')\nplt.xlabel('Date')\nplt.ylabel('Open Incidents')\nplt.xticks(rotation=45)\nplt.grid(True)\nplt.legend(title='Agent')\nplt.show()"
        },
        {
            "data_type": "diagnostic",
            "insight": "Fred Luddy's increasing TTR correlates with his PTO period",
            "insight_value": {
                "x_val": "PTO Period",
                "y_val": "Increase in TTR"
            },
            "plot": {
                "plot_type": "timeline",
                "title": "Timeline of Fred Luddy's Leave Periods and TTR Correlation",
                "x_axis": {
                    "name": "Date",
                    "value": [
                        "2023-01-01",
                        "2023-12-31"
                    ],
                    "description": "This represents the timeline over which Fred Luddy's PTO and TTR data is analyzed."
                },
                "y_axis": {
                    "name": "Leave Indicator",
                    "value": [
                        0,
                        1
                    ],
                    "description": "This axis indicates the presence of a leave period. The value is binary, where a visible bar indicates a leave period."
                },
                "description": "The plot uses shaded red areas to visually represent the periods when Fred Luddy was on PTO. These periods are shown over a timeline that spans the current analysis period. The timeline illustrates that the increase in TTR for Fred Luddy begins to rise linearly at the onset of his first leave period and remains elevated. This visualization helps in identifying a potential correlation between his leave periods and the observed increase in TTR."
            },
            "question": "What are the dates and duration of the agent\u2019s leave (PTO)?",
            "actionable_insight": "Given the correlation between Fred Luddy's PTO periods and the increase in his TTR, it is crucial to plan for adequate coverage or support during his future leaves. This could involve redistributing his workload more effectively among other team members or providing temporary additional resources to manage the increased load. Such proactive measures could help mitigate the impact of his absence on overall service resolution times and maintain consistent performance across the team.",
            "code": "fred_schedule = df_usr[df_usr['name'] == 'Fred Luddy']['schedule'].iloc[0]\nfred_schedule = eval(fred_schedule)\nimport matplotlib.dates as mdates\n\n# Assuming df is already defined and has 'opened_at' and 'closed_at' columns converted to datetime\ndf['opened_at'] = pd.to_datetime(df['opened_at'])\ndf['closed_at'] = pd.to_datetime(df['closed_at'])\n\n# Define the current date for the analysis, simulate up to the last 'opened_at' date\ncurrent_date = df['opened_at'].max()\n# Create a range of dates from the start to the current date\ndate_range = pd.date_range(start=df['opened_at'].min(), end=current_date, freq='D')\n\n# Fred's PTO schedule as list of tuples with start and end dates\npto_schedule = fred_schedule\n\n# Plotting\nfig, ax = plt.subplots(figsize=(10, 2))  # Adjust the figure size as needed\n\n# Plot each leave period as a rectangle\nfor start, end in pto_schedule:\n    ax.axvspan(start, end, color='red', alpha=0.5, label='PTO (Leave Period)')\n\n# Set limits, labels, title and legend\nax.set_xlim([date_range.min(), date_range.max()])\nax.set_ylim(0, 1)  # Static Y limits as we are only plotting periods\nax.set_yticks([])  # Hide Y axis ticks\nax.set_xlabel('Date')\nax.set_title('Timeline of Fred Luddy\\'s Leave Periods')\nax.legend(loc='upper right')\n\n# Formatting the x-axis to make it more readable\nax.xaxis.set_major_locator(mdates.MonthLocator())\nax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))\n\nplt.xticks(rotation=45)\nplt.tight_layout()\nplt.show()"
        },
        {
            "data_type": "predictive",
            "insight": "There is a continued linear increase in TTR if workload during leave periods is not balanced",
            "insight_value": {
                "trend": "Linear Increase",
                "prediction": "Continued escalation in resolution times"
            },
            "description": "The analysis of Fred Luddy's TTR shows a clear linear increase that coincides with his periods of leave. Without intervention, this trend is predicted to continue, leading to progressively longer resolution times for incidents assigned to him. This ongoing increase in TTR could potentially result in decreased customer satisfaction, increased stress on other team members who may need to compensate informally, and a general decline in team performance.",
            "recommendation": {
                "action": "Implement workload balancing strategies during leave periods",
                "expected_outcome": "Stabilization of TTR and maintenance of service quality",
                "urgency": "High"
            },
            "actionable_insight": "To prevent the predicted escalation in TTR, it is essential to implement strategic workload balancing during Fred Luddy's leave periods. This could involve redistributing his responsibilities among team members or hiring temporary staff. Additionally, ensuring that all agents are equally trained to handle the types of incidents typically assigned to Fred could mitigate the risk of increased TTR. Proactive measures are crucial to maintain consistent performance levels and uphold customer satisfaction.",
            "code": "import pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport matplotlib.dates as mdates\nfrom sklearn.linear_model import LinearRegression\n\n# Load data\ndataset_path = \"csvs/flag-14.csv\"\ndf = pd.read_csv(dataset_path)\ndf = df[df['assigned_to'] == 'Fred Luddy']\ndf['opened_at'] = pd.to_datetime(df['opened_at'])\ndf['closed_at'] = pd.to_datetime(df['closed_at'])\n\n# Compute resolution time in days\ndf[\"resolution_time\"] = (df[\"closed_at\"] - df[\"opened_at\"]).dt.total_seconds() / 86400\n# Remove rows with NaN values in 'resolution_time'\ndf = df.dropna(subset=['resolution_time'])\n# Convert dates to ordinal for regression analysis\ndf['date_ordinal'] = df['opened_at'].apply(lambda x: x.toordinal())\n\n# Prepare data for linear regression\nX = df['date_ordinal'].values.reshape(-1, 1)  # Reshape for sklearn\ny = df['resolution_time'].values  # Target variable\n\n# Fit the linear regression model\nmodel = LinearRegression()\nmodel.fit(X, y)\n\n# Predict future values\nfuture_dates = pd.date_range(start=df['opened_at'].max(), periods=120, freq='D')  # 4 months into the future\nfuture_dates_ordinal = [d.toordinal() for d in future_dates]\nfuture_preds = model.predict(np.array(future_dates_ordinal).reshape(-1, 1))\n\n# Plotting\nplt.figure(figsize=(12, 6))\nplt.scatter(df['opened_at'], df['resolution_time'], color='blue', label='Historical TTR')\nplt.plot(future_dates, future_preds, color='red', linestyle='--', label='Predicted TTR Trend')\nplt.title('Projected Increase in TTR for Fred Luddy')\nplt.xlabel('Date')\nplt.ylabel('Time to Resolution (days)')\nplt.legend()\nplt.grid(True)\n\n# Formatting the x-axis to make it more readable\nplt.gca().xaxis.set_major_locator(mdates.MonthLocator())\nplt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))\n\nplt.xticks(rotation=45)\nplt.tight_layout()\nplt.show()"
        }
    ],
    "insights": [
        "Fred Luddy has a significantly higher average TTR compared to other agents",
        "Fred Luddy's TTR begins to increase linearly over time from a specific point of time (01-06-2023) compared to other agents who maintain a uniform TTR",
        "The number of incidents assigned to each agent, including Fred Luddy, remains uniform over time",
        "The number of open incidents for Fred Luddy is increasing over time, coinciding with the period where his TTR began to increase linearly",
        "Fred Luddy's increasing TTR correlates with his PTO period",
        "There is a continued linear increase in TTR if workload during leave periods is not balanced"
    ],
    "summary": "\n\n1. **Trend in Resolution Time**: Analysis reveals that Fred Luddy's time to resolution (TTR) begins to increase linearly from a certain point, suggesting potential personal or systemic challenges impacting his performance.\n   \u00df\n2. **Correlation with Open Incidents**: The increase in TTR coincides with a rise in the number of open incidents assigned to Fred Luddy. \n   \n3. **Correlation with LEave Schedule**: This pattern is likely exacerbated by periods of his leave of absence, during which unresolved incidents accumulate.\n   \n4. **Strategic Workload Management**: Addressing these trends is crucial, especially during employee absences, to prevent workload imbalances and ensure consistent performance across the team. Implementing strategic workload balancing could mitigate the impact of such absences and maintain operational efficiency."
}