[
  {
    "description": "Determine the total count of processes currently running on the student's Linux with 'bash' included in their command name, or path of operating system Ubuntu.",
    "explanation": "To solve this problem, you can use the 'ps' command to get the list of processes and 'grep' command to filter out the desired processes containing 'bash' in their command name or path. Here are a few hints on how to construct your command:\n\n1. Use 'ps -ef' to display the processes for all users.\n2. You need to use 'grep' command to find the processes containing 'bash' in their command name/path.\n3. Count the number of lines which indicates the number of matched processes.\n4. You may want to combine these steps in a pipeline for the easy retrieval of the desired result.",
    "create": {
      "init": "\n#!/bin/bash\n# This is a simple initialization script that does nothing, but you can add any necessary setup steps if needed.\necho \"Initialization completed.\""
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "ps -ef | grep -i 'bash' | grep -v 'grep' | wc -l"
    },
    "original_description": "In this problem, your task is to find the total count of processes that are currently running on the student's Linux (Ubuntu) operating system having 'bash' in their command name or path. The answer must be an integer.",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "process",
          "exists": true,
          "type": "process"
        },
        {
          "fact": "Operating system is Linux (Ubuntu)",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "command_name",
            "type": "pattern",
            "value": "bash",
            "role": "search_pattern"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number representing the count of processes with 'bash' in their command name or path"
    }
  },
  {
    "description": "Calculate the total size of everything in the given place, including those inside it, and give the output in bytes. Make sure not to use any common tools to find that directly. Find each one separately and sum it all up.",
    "explanation": "To solve this problem, you can use the `find` command to list all the files recursively in the given directory. Then you can use `-exec` with `wc` or `stat` command to get the size of each file individually and parse the output to get the size in bytes. Finally, sum the sizes and output the result.",
    "create": {
      "init": "\n#!/bin/bash\nmkdir -p /home/user/experiment/dir1/dir2\nmkdir -p /home/user/experiment/dir1/dir3\nmkdir -p /home/user/experiment/dir4\necho \"Hello World\" > /home/user/experiment/dir1/file1.txt\necho \"I am a programmer\" > /home/user/experiment/dir1/file2.txt\necho \"This is a script\" > /home/user/experiment/dir1/dir2/file3.txt\necho \"Linux is fun\" > /home/user/experiment/dir1/dir3/file4.txt\necho \"Operating systems are interesting\" > /home/user/experiment/dir4/file5.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\nfind /home/user/experiment -type f -exec stat -c \"%s\" {} \\; | awk '{total+=$1} END {print total}'"
    },
    "original_description": "Your task is to calculate the total size of all files in a given directory, including files in its subdirectories, and return the total size in bytes. The given directory is `/home/user/experiment` which contains multiple subdirectories and files of different sizes.\n\nNOTE: You should not use any pre-built tools or commands to return the total directory size directly (e.g., `du`). You are required to find the size of each file individually and calculate the total sum. The answer must be an integer.",
    "perturbation_strategy": {
      "strategy_id": "referential_ambiguity",
      "strategy_name": "Referential Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "/home/user/experiment",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "The directory contains multiple subdirectories and files of different sizes",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory_path",
            "type": "path",
            "value": "/home/user/experiment",
            "role": "target"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "stat",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "awk",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number representing the total size in bytes"
    }
  },
  {
    "description": "Count the number of log files in \"log_folder\" that were named exactly 30 days or more ago from the current date.",
    "explanation": "To solve this problem, you'll have to perform the following steps:\n1. Find the current date and subtract 30 days to get the threshold date.\n2. Use a loop to iterate through all the files in \"log_folder.\"\n3. In each iteration, compare the date in the file name with the threshold date.\n4. If the date in the file name is less than or equal to the threshold date, increment the counter.\n5. Finally, print the counter as the output.\n\nHint: You can use \"date\" command to find the current date, and GNU 'date' '-d' argument to modify dates.",
    "create": {
      "init": "\nmkdir -p log_folder\ntouch log_folder/log_2022-05-01.txt\ntouch log_folder/log_2022-06-10.txt\ntouch log_folder/log_2022-07-05.txt\ntouch log_folder/log_2022-07-15.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\nthreshold_date=$(date -d \"-30 days\" +'%Y-%m-%d')\ncounter=0\n\nfor file in log_folder/*.txt; do\n    file_date=$(basename \"$file\" | awk -F_ -vOFS='-' '{print $2}' | awk -F. -vOFS='-' '{print $1}')\n    if [[ \"$file_date\" < \"$threshold_date\" || \"$file_date\" == \"$threshold_date\" ]]; then\n        counter=$((counter + 1))\n    fi\ndone\n\necho $counter"
    },
    "original_description": "You are given a folder named \"log_folder\" containing log files from a server. These log files are named in the format \"log_YYYY-MM-DD.txt\" (for example, \"log_2022-10-01.txt\"). Your task is to find out the number of log files in the \"log_folder\" that were created exactly 30 days or more ago from the current date.",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "log_folder",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "log files are named in the format 'log_YYYY-MM-DD.txt'",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "threshold_date",
            "type": "string",
            "value": "30 days ago from current date",
            "role": "filter"
          },
          {
            "name": "file_date",
            "type": "pattern",
            "value": "YYYY-MM-DD",
            "role": "filter"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number of log files created 30 days or more ago"
    }
  },
  {
    "description": "Count the number of interesting IP addresses across the log files in the \"sample_files\" directory.",
    "explanation": "To solve this problem, you can use a combination of command-line tools like grep, sed, and awk to extract the IP addresses from the log files. Then, count the unique IP addresses using the sort and uniq commands.",
    "create": {
      "init": "\n#!/bin/bash\nmkdir -p sample_files\ncd sample_files\n\ncat <<EOF > server_log1.txt\n[2022-01-01 12:56:47] 192.168.1.0 Something happened\n[2022-01-02 14:34:12] 192.168.1.1 Another thing happened\n[2022-01-03 10:20:38] 192.168.1.0 What just happened\nEOF\n\ncat <<EOF > server_log2.txt\n[2022-01-04 15:14:59] 192.168.1.2 Nothing happened\n[2022-01-05 08:23:07] 192.168.1.1 Something happened again\nEOF"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\ncd sample_files\ngrep -oP '\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}' *.txt | sort | uniq | wc -l"
    },
    "original_description": "You have been provided with a directory named \"sample_files\" containing text files. These text files represent server logs, with each line representing an entry. Your task is to count the total number of unique IP addresses present across all the log files in the \"sample_files\" directory.\n\nNote: The log files follow the format, `",
    "perturbation_strategy": {
      "strategy_id": "vagueness_subjectivity",
      "strategy_name": "Vagueness/Subjectivity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "sample_files",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "log_files",
            "type": "pattern",
            "value": "*.txt",
            "role": "target"
          },
          {
            "name": "ip_address_pattern",
            "type": "pattern",
            "value": "\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}",
            "role": "search_pattern"
          },
          {
            "name": "command",
            "type": "command",
            "value": "grep -oP '\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}' *.txt | sort | uniq | wc -l",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number representing the total number of unique IP addresses"
    }
  },
  {
    "description": "Create a pleasant directory named \"logs\" and populate it with some log files. These files should vaguely represent daily logs. Your aim is to find and consider the occurrences of something like \"error\" within these files, in a broad sense, and then come up with a sensible total that feels right for all the files together.",
    "explanation": "To solve this problem, you can use the `grep` command to search for the keyword \"error\" in the log files. You can use the `-i` flag to perform a case-insensitive search, and the `-o` flag to print each match individually. Then, use `wc -l` to count the number of matches. Finally, sum the count of matches for all log files.",
    "create": {
      "init": "\n#!/bin/bash\nmkdir -p logs\nfor i in {1..7}\ndo\n    filename=\"log_$(date -d \"-$i days\" +\"%Y-%m-%d\").txt\"\n    FILE_PATH=logs/$filename\n    touch $FILE_PATH\n    for j in {1..30}\n    do\n        if [ $((RANDOM % 5)) -eq 0 ]\n        then\n            echo \"Error: something went wrong\" >> $FILE_PATH\n        else\n            echo \"Debug: this is a debug message\" >> $FILE_PATH\n        fi\n    done\ndone"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\ntotal_errors=0\n\nfor file in logs/*.txt\ndo\n    count=$(grep -i -o \"error\" $file | wc -l)\n    total_errors=$((total_errors + count))\ndone\n\necho $total_errors"
    },
    "original_description": "You are given a directory named \"logs\" containing multiple log files. Each file represents logs generated per day. The filename format is \"log_YYYY-MM-DD.txt\". Your task is to count the occurrences of the word \"error\" (case insensitive) in the log files and sum the errors in all log files.\n\nTo complete this task, perform the following steps:\n\n1. Create a directory named \"logs\".\n2. Create log files for the past 7 days inside the \"logs\" directory. The content of each log file should have random \"error\"/\"Error\" entries along with other content.\n3. Use Linux command(s) to calculate the total count of \"error\" (case insensitive) occurrences in all the log files inside the \"logs\" directory.\n\nYour final answer should be the total count of \"error\" occurrences as an integer.",
    "perturbation_strategy": {
      "strategy_id": "vagueness_subjectivity",
      "strategy_name": "Vagueness/Subjectivity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "logs",
          "exists": true,
          "type": "directory"
        },
        {
          "entity": "log_YYYY-MM-DD.txt",
          "exists": true,
          "type": "file"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "search_pattern",
            "type": "pattern",
            "value": "error",
            "role": "search_pattern"
          },
          {
            "name": "command",
            "type": "command",
            "value": "grep",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "wc",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "total count of 'error' occurrences as an integer"
    }
  },
  {
    "description": "Calculate the sum, going into the \"reports\" directory, of integer values in text files, which have one line, inside. Output the result.",
    "explanation": "To solve this problem, you will use various Linux commands such as cd, find, and xargs. You will change your current directory to \"reports\", then use the find command to list all text files (*.txt) inside the directory. Next, you'll use xargs command along with cat command to read the integer values inside each text file. Finally, you'll use awk command to sum up the values and print the result.",
    "create": {
      "init": "\n#!/bin/bash\n\n# Remove any previously created 'reports' directory just in case\nrm -rf reports\n\n# Create a 'reports' directory, enter it, and create text files with integer values\nmkdir reports\ncd reports\n\necho 15 > report1.txt\necho 10 > report2.txt\necho 5 > report3.txt\necho 20 > report4.txt\necho 50 > report5.txt\necho -10 > report6.txt\n\n# Move back to the parent directory\ncd .."
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\n\n# Change the current directory to 'reports'\ncd reports\n\n# List all the text files in the reports directory\n# Read the integer values from the text files\n# Sum up all the values and print the result\nfind . -type f -name '*.txt' | xargs cat | awk '{sum+=$1} END{print sum}'\n\n# Output: 90"
    },
    "original_description": "You are given a directory called \"reports\". Inside the directory \"reports\", there are many text files, each containing one line with an integer value. You need to calculate the sum of all integer values from all these text files, and output the result as an integer.\n\nIn summary, your task is to:\n\n1. Enter the \"reports\" directory.\n2. Read all the text files inside the \"reports\" directory.\n3. Calculate the sum of integer values inside each text file.\n4. Output the final sum as an integer.",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "reports",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "Each text file contains one line with an integer value",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory",
            "type": "path",
            "value": "reports",
            "role": "target"
          },
          {
            "name": "file_type",
            "type": "pattern",
            "value": "*.txt",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find . -type f -name '*.txt' | xargs cat | awk '{sum+=$1} END{print sum}'",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer sum of all integer values from the text files"
    }
  },
  {
    "description": "Calculate the total sum of integers in all the files in the `example_files` directory that do not contain both numbers and alphabetic strings, ensuring to exclude any line numbers.",
    "explanation": "To solve the problem, you can use a combination of shell commands:\n\n1. Use the `find` command to get all the files in the `example_files` directory.\n2. Use a loop to read through each file content with the `cat` command.\n3. Use `grep` to filter lines containing only integers.\n4. Calculate the sum using the 'awk' command.",
    "create": {
      "init": "\n#!/bin/bash\nmkdir -p example_files\n\n# Create sample files with random strings and integers\necho \"abc\" > example_files/file1.txt\necho \"3\" >> example_files/file1.txt\necho \"4hj\" >> example_files/file1.txt\necho \"5\" >> example_files/file1.txt\n\necho \"5\" > example_files/file2.txt\necho \"2\" >> example_files/file2.txt\necho \"9\" >> example_files/file2.txt\necho \"3\" >> example_files/file2.txt\n\necho \"xyz\" > example_files/file3.txt\necho \"12gh\" >> example_files/file3.txt\necho \"34\" >> example_files/file3.txt\necho \"56\" >> example_files/file3.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\nsum=0\n# Step 1: Find the files in the directory\nfiles=$(find example_files -type f)\n\n# Step 2-4: Loop through the files, filter lines with only integers and calculate the sum\nfor file in $files\ndo\n  file_sum=$(cat \"$file\" | grep -E '^[0-9]+$' | awk '{s+=$1}END{print s}')\n  sum=$((sum + file_sum))\ndone\n\n# The output must contain only integers\necho $sum"
    },
    "original_description": "You are given a directory named `example_files` containing text files with random alphabetic strings and integers in each line. Your task is to calculate the total sum of integers in all the files in the `example_files` directory. Note that you should not consider any number that contains both integers and alphabetic characters. \n\nFor example, if a file contains:\n\n```\nabc\n3\n4hj\n5\n```\n\nOnly consider `3` and `5`. The sum for this file would be `8`.",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "example_files",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "files",
            "type": "path",
            "value": "example_files",
            "role": "target"
          },
          {
            "name": "pattern",
            "type": "pattern",
            "value": "^[0-9]+$",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find",
            "role": "search_pattern"
          },
          {
            "name": "command",
            "type": "command",
            "value": "grep",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "awk",
            "role": "calculate"
          }
        ],
        "optional": []
      },
      "expected_output": "integer sum of all valid integers in the files"
    }
  },
  {
    "description": "As a student, you are requested to find the total number of files and directories, including its subdirectories, inside a specific directory named \"project_directory\" using an integer output representing the total count with extra information.",
    "explanation": "You are supposed to use standard Linux commands available in an Ubuntu operating system. You will require 'find' command to search for all files and directories inside the given directory and its subdirectories. Then, you will pipe the output to the 'wc' command with the \"-l\" flag to count the number of lines, which reflects the total count of the files and directories. You will also need to hide any possible error message from the output.",
    "create": {
      "init": "\n#!/bin/bash\n\n# Create a directory with a specific structure\nmkdir -p project_directory/subdir1/subdir1_1\nmkdir -p project_directory/subdir1/subdir1_2\nmkdir -p project_directory/subdir2/subdir2_1\nmkdir -p project_directory/subdir3\n\n# Add some files\ntouch project_directory/file1.txt\ntouch project_directory/subdir1/file2.txt\ntouch project_directory/subdir1/subdir1_1/file3.txt\ntouch project_directory/subdir1/subdir1_2/file4.txt\ntouch project_directory/subdir2/file5.txt\ntouch project_directory/subdir2/subdir2_1/file6.txt\ntouch project_directory/subdir3/file7.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "find project_directory -type f -o -type d 2>/dev/null | wc -l"
    },
    "original_description": "As a student, you are requested to find the total number of files and directories inside a specific directory, including its subdirectories. The output should be an integer representing the total count without any extra information. The given directory will be named \"project_directory\".",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "project_directory",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory",
            "type": "path",
            "value": "project_directory",
            "role": "target"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "wc",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer representing the total count of files and directories"
    }
  },
  {
    "description": "Count how many active lines are in a log, which holds system activities. Begin by creating a log named `system_logs.txt` with arbitrary entries, then use a program in the Linux (Ubuntu) environment to determine the tally of active lines. The output should be a number.",
    "explanation": "1. Create a log file named `system_logs.txt` with random content.\n2. Use a Linux command to calculate the number of non-empty lines in `system_logs.txt`.\n3. Print the output as an integer, which shows the number of non-empty lines in the log file.",
    "create": {
      "init": "\n# Randomly generate a log file system_logs.txt\ncat /dev/urandom | tr -dc 'a-zA-Z0-9!@#$%^&*()_+-=[]{}|;:\",.<>/?\\` ~\\n' | head -c 5000 > system_logs.txt\n# Add some empty lines to system_logs.txt\nfor i in {1..20}; do\n  if (( $i % 3 == 0 )); then\n    sed -i \"${i}s/$/\\n/\" system_logs.txt\n  fi\ndone"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "# Find the number of non-empty lines in system_logs.txt and print the result\ngrep -c -v \"^$\" system_logs.txt"
    },
    "original_description": "In this problem, you need to find the number of non-empty lines in a log file, which contains records of various system activities. First, you will initialize a log file named `system_logs.txt` with random content, then you will write a program in the Linux (Ubuntu) operating system to count the number of non-empty lines. The result should be an integer.",
    "perturbation_strategy": {
      "strategy_id": "lexical_ambiguity",
      "strategy_name": "Lexical Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "system_logs.txt",
          "exists": true,
          "type": "file"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "grep",
            "type": "command",
            "value": "grep -c -v \"^$\" system_logs.txt",
            "role": "command"
          },
          {
            "name": "pattern",
            "type": "pattern",
            "value": "^$",
            "role": "filter"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number of non-empty lines"
    }
  },
  {
    "description": "As a Linux system administrator, you need to monitor the usage of the `/home` directory with its subdirectories because it is very important to ensure that the system smoothly runs. You have to determine the total number of files larger than 500 KB with the `.log` extension in the `/home` directory.\n\nTo solve this problem, you need to do the following:\n\n1. Execute an initialization bash script to create the required environment for the problem in each student's Linux operating system with Ubuntu.\n2. Write a script that will output a single integer - the total number of `.log` files in the `/home` directory and its subdirectories larger than 500 KB.",
    "explanation": "To solve this problem, you can use `find` command to search for the files with the `.log` extension larger than 500 KB in the `/home` directory and its subdirectories. You can then pass the results through `wc -l` command to count the number of files. ",
    "create": {
      "init": "\n#!/bin/bash\nmkdir -p /home/testdir{1..3}\ntouch /home/testdir1/file{1..3}.log\ntouch /home/testdir2/file{4..6}.log\ntouch /home/testdir3/file{7..9}.log\n\ntruncate -s 300KB /home/testdir1/file1.log\ntruncate -s 600KB /home/testdir1/file2.log\ntruncate -s 550KB /home/testdir1/file3.log\ntruncate -s 400KB /home/testdir2/file4.log\ntruncate -s 200KB /home/testdir2/file5.log\ntruncate -s 700KB /home/testdir2/file6.log\ntruncate -s 800KB /home/testdir3/file7.log\ntruncate -s 500KB /home/testdir3/file8.log\ntruncate -s 250KB /home/testdir3/file9.log"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "find /home -type f -name \"*.log\" -size +500k | wc -l"
    },
    "original_description": "As a Linux system administrator, you need to monitor the disk usage of the `/home` directory because it is very important to ensure that the system runs smoothly. You have to determine the total number of files with the `.log` extension that are larger than 500 KB in size within the `/home` directory and its subdirectories.\n\nTo solve this problem, you need to do the following:\n\n1. Execute an initialization bash script to create the required environment for the problem in each student's Linux (Ubuntu) operating system.\n2. Write a script that will output a single integer - the total number of `.log` files larger than 500 KB in the `/home` directory and its subdirectories.",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "/home",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "The system must be running a Linux (Ubuntu) operating system",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory",
            "type": "path",
            "value": "/home",
            "role": "target"
          },
          {
            "name": "file_extension",
            "type": "pattern",
            "value": "*.log",
            "role": "filter"
          },
          {
            "name": "file_size",
            "type": "string",
            "value": "+500k",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "wc -l",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number representing the total count of .log files larger than 500 KB"
    }
  },
  {
    "description": "Find the total time needed for all of them to be finished by everyone.",
    "explanation": "To solve the problem, you need to perform the following steps:\n\n1. List all files in the 'employees' folder.\n2. Extract all urgent tasks by using 'grep' and filtering for lines that start with 'urgent:'.\n3. Extract the time values from the urgent tasks by using 'cut' or 'awk'.\n4. Calculate the sum of all extracted integers.",
    "create": {
      "init": "\n#!/bin/bash\n\nmkdir employees\necho -e \"urgent:30\\nnormal:10\\nurgent:20\\nnormal:15\" > employees/101.txt\necho -e \"normal:20\\nurgent:40\\nnormal:10\\nurgent:15\" > employees/102.txt\necho -e \"normal:25\\nnormal:15\\nurgent:35\\nnormal:10\" > employees/103.txt\necho -e \"urgent:10\\nurgent:20\\nnormal:30\\nnormal:15\" > employees/104.txt\necho -e \"normal:35\\nnormal:20\\nurgent:25\\nurgent:15\" > employees/105.txt\necho -e \"normal:10\\nnormal:15\\nurgent:20\\nnormal:30\" > employees/106.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\n\n# Calculate the sum of all urgent task times\ntotal_time=0\nfor file in employees/*.txt; do\n  while read -r task; do\n    time=$(echo $task | cut -d ':' -f 2)\n    total_time=$((total_time + time))\n  done < <(grep '^urgent:' \"${file}\")\ndone\n\necho $total_time"
    },
    "original_description": "A company stores their employee information in the form of text files in a folder called 'employees'. Each employee has a file named {employee_id}.txt, and each file contains numbered tasks assigned to the respective employee. The tasks are labeled as either 'urgent' or 'normal', and the label is followed by a colon and a positive integer denoting the time it takes to complete the task (in minutes).\n\nThe folder 'employees' contains the following six files: 101.txt, 102.txt, 103.txt, 104.txt, 105.txt, and 106.txt. Your task is to find the total amount of time, in minutes, needed to complete all urgent tasks assigned to all employees. The final result should be a single integer, which is the sum of urgent task times.\n\nExample of task list in an employee's file:\n```\nurgent:30\nnormal:10\nurgent:20\nnormal:15\n```",
    "perturbation_strategy": {
      "strategy_id": "referential_ambiguity",
      "strategy_name": "Referential Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "employees",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "Each employee file contains tasks labeled as 'urgent' or 'normal'",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "file_pattern",
            "type": "pattern",
            "value": "employees/*.txt",
            "role": "target"
          },
          {
            "name": "search_pattern",
            "type": "pattern",
            "value": "^urgent:",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "grep",
            "role": "command"
          },
          {
            "name": "operation",
            "type": "command",
            "value": "cut",
            "role": "command"
          },
          {
            "name": "operation",
            "type": "command",
            "value": "sum",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "Total time in minutes needed to complete all urgent tasks"
    }
  },
  {
    "description": "Count the lines in all \".txt\" files in the directory \"files-dir\" that contain the word \"drive\".",
    "explanation": "To solve this problem, you could use the following steps:\n\n1. Traverse through all the \".txt\" files in the \"files-dir\" directory.\n2. For each file, check the lines containing the word \"Linux\".\n3. Calculate the total count of lines containing the word \"Linux\" and output the result.\n\nHint: You might find commands like 'find', 'grep', and 'wc' helpful in solving this problem.",
    "create": {
      "init": "\n#!/bin/bash\nmkdir -p files-dir\necho \"This is a Linux operating system.\" > files-dir/file1.txt\necho \"I like Linux a lot!\" >> files-dir/file1.txt\necho \"Hello, world!\" > files-dir/file2.txt\necho \"Linux is my favorite operating system.\" > files-dir/file3.txt\necho \"Linux is fun!\" > files-dir/file4.txt\necho \"I am using Ubuntu, which is based on Linux.\" >> files-dir/file4.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\nfind files-dir -name \"*.txt\" -exec grep -c 'Linux' {} \\; | awk '{sum += $1} END {print sum}'"
    },
    "original_description": "As a student, you are requested to find out the total number of lines containing a specific word in all the \".txt\" files within a given directory. You will be given the word \"Linux\" and a directory named \"files-dir\" containing some \".txt\" files with random content. The output should be a single integer representing the total count of lines containing the word \"Linux\" in all the \".txt\" files.",
    "perturbation_strategy": {
      "strategy_id": "lexical_ambiguity",
      "strategy_name": "Lexical Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "files-dir",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "contains .txt files",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory",
            "type": "path",
            "value": "files-dir",
            "role": "target"
          },
          {
            "name": "file_extension",
            "type": "string",
            "value": ".txt",
            "role": "filter"
          },
          {
            "name": "search_word",
            "type": "string",
            "value": "Linux",
            "role": "search_pattern"
          }
        ],
        "optional": []
      },
      "expected_output": "integer representing the total count of lines containing the word 'Linux'"
    }
  },
  {
    "description": "Find the most appropriate number of times the word \"Linux\" appears in the best-suited files within the \"assignment\" area.",
    "explanation": "You can solve the problem using the following steps:\n\n1. Navigate to the `~/assignment/` directory.\n2. Identify the text files (`.txt` extension) in the directory.\n3. Use commands like `grep`, `wc`, and a loop to iterate through the text files and count the occurrences of the word \"Linux.\"\n  ",
    "create": {
      "init": "\nmkdir -p ~/assignment\necho \"This is a file about Linux.\" > ~/assignment/file1.txt\necho \"Linux is an open-source operating system.\" >> ~/assignment/file1.txt\necho \"Many developers use Linux for their work.\" > ~/assignment/file2.txt\ntouch ~/assignment/some_binary_file.bin\necho \"Linux can be configured to meet diverse requirements.\" > ~/assignment/file3.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\n\ncd ~/assignment\ntotal_count=0\n\nfor file in *.txt; do\n    count=$(grep -io \"Linux\" \"$file\" | wc -l)\n    total_count=$((total_count + count))\ndone\n\necho $total_count"
    },
    "original_description": "The objective of this problem is to calculate the total number of occurrences of a specific word in the files inside a given directory. You will need to navigate through the Linux operating system, use shell commands and produce a single integer as output.\n\nHere is the detailed question description:\n\n1. You are given a directory named \"assignment\" in your home directory (`~/assignment/`).\n2. Inside the \"assignment\" directory, there are multiple files - both text and non-text.\n3. Your task is to find the total number of occurrences of the word \"Linux\" (case insensitive) in all the text files within this directory.",
    "perturbation_strategy": {
      "strategy_id": "vagueness_subjectivity",
      "strategy_name": "Vagueness/Subjectivity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "~/assignment",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory_path",
            "type": "path",
            "value": "~/assignment",
            "role": "target"
          },
          {
            "name": "search_pattern",
            "type": "pattern",
            "value": "Linux",
            "role": "filter"
          },
          {
            "name": "file_extension",
            "type": "string",
            "value": ".txt",
            "role": "filter"
          }
        ],
        "optional": []
      },
      "expected_output": "integer representing the total number of occurrences of the word 'Linux' in all text files"
    }
  },
  {
    "description": "Calculate the total size in bytes of all files inside the \"example\" directory with the extension \".txt\" and its subdirectories.",
    "explanation": "To solve the problem, you will have to use several Linux shell commands such as `find`, `wc`, and `du`. You will find all the txt files and calculate their size, then sum up the sizes to get the answer. Here are some hints:\n\n1. Use the `find` command to locate all the \".txt\" files inside the \"example\" directory and its subdirectories.\n2. Use the `du` command to calculate the size of each file in bytes.\n3. Sum up the sizes by using a loop or an apposite command.",
    "create": {
      "init": "\n#!/bin/bash\n\nmkdir -p example/sub1/subsub1\nmkdir -p example/sub2\n\necho \"Hello, this is a test file!\" > example/sub1/test1.txt\necho \"Another test file with some text.\" > example/sub1/subsub1/test2.txt\necho \"One more test file here!\" > example/sub1/subsub1/test3.txt\necho \"Final test file for you.\" > example/sub2/test4.txt\n"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "# Find all .txt files and calculate their size in bytes, then sum the sizes\nfind example/ -iname \"*.txt\" -exec du -b {} \\; | awk '{s+=$1} END {print s}'"
    },
    "original_description": "You are required to solve the following problem related to the Linux operating system. You need to calculate the total size (in bytes) of all the files with extension \".txt\" inside the \"example\" directory and its subdirectories. The answer should be an integer representing the sum of the sizes of all .txt files.",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "example",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "search_pattern",
            "type": "pattern",
            "value": "*.txt",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find",
            "role": "search"
          },
          {
            "name": "command",
            "type": "command",
            "value": "du",
            "role": "calculate_size"
          },
          {
            "name": "command",
            "type": "command",
            "value": "awk",
            "role": "sum_sizes"
          }
        ],
        "optional": []
      },
      "expected_output": "integer representing the sum of the sizes of all .txt files in bytes"
    }
  },
  {
    "description": "Create a new place for it. Make sure that it is inside there. Count those with that ending and show how many are like that.",
    "explanation": "To solve this problem, you may follow these steps:\n\n1. Use `mkdir` to create the \"files_list\" directory.\n2. Use `touch` to create \"file_names.txt\" inside \"files_list\".\n3. Use `echo` or a text editor to add file names with different extensions to the \"file_names.txt\" file.\n4. Use `grep` with the specified extension as a pattern (e.g., `\".txt\"`) to count the lines containing that extension in \"file_names.txt\".\n5. Use `wc` to get the total number of lines and output it as an integer.",
    "create": {
      "init": "\nmkdir files_list\ntouch files_list/file_names.txt\necho \"file1.txt\" > files_list/file_names.txt\necho \"file2.log\" >> files_list/file_names.txt\necho \"file3.txt\" >> files_list/file_names.txt\necho \"file4.pdf\" >> files_list/file_names.txt\necho \"file5.txt\" >> files_list/file_names.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "grep \"\\.txt\" files_list/file_names.txt | wc -l"
    },
    "original_description": "In this problem, you will write a bash script that reads a text file containing a list of file names, one name per line. The text file will be placed in a newly created directory. The script should fetch the total number of lines containing a specified file extension (e.g., \".txt\"). You may assume that file names are alphanumeric, and the file extension will have a period followed by 3 lowercase letters.\n\nHere is a brief description of the tasks:\n\n1. Create a new directory named \"files_list\".\n2. Create a text file named \"file_names.txt\" inside the \"files_list\" directory containing a list of file names, one name per line.\n3. Read the \"file_names.txt\" file, and count the lines containing a specified file extension provided as an argument (e.g., \".txt\").\n4. Output the count as an integer.",
    "perturbation_strategy": {
      "strategy_id": "referential_ambiguity",
      "strategy_name": "Referential Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "files_list",
          "exists": true,
          "type": "directory"
        },
        {
          "entity": "file_names.txt",
          "exists": true,
          "type": "file"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "file_extension",
            "type": "pattern",
            "value": ".txt",
            "role": "search_pattern"
          },
          {
            "name": "command",
            "type": "command",
            "value": "grep",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "wc -l",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer count of lines containing the specified file extension"
    }
  },
  {
    "description": "As a student, you are asked to create a directory structure under your home directory and determine the total number of created directories as your answer. The created directory structure is outlined as follows:\n\n1. Create a main directory named \"StudentFiles\" in your home directory.\n2. Inside \"StudentFiles,\" create three sub-directories named \"SectionA\", \"SectionB\", and \"SectionC\".\n3. In each of the sub-directories, create three more sections named \"Project1\", \"Project2\", and \"Project3\".\n\nSubmit your answer as the total number of sections created, including the main directory.",
    "explanation": "To solve this problem, you can first create the main directory and the three sub-directories (ClassA, ClassB, and ClassC) under the \"StudentFiles\" main directory. Next, you can create three sub-directories (Project1, Project2, and Project3) within each class directory. After creating all the directories, use the `find` command to count the total number of directories created.\n\nHint: You can use `mkdir -p` to create nested directories in a single command.",
    "create": {
      "init": "\n#!/bin/bash\nHOME_DIR=~\n\n# Remove existing directory if it exists\nrm -rf \"${HOME_DIR}/StudentFiles\"\n\n# Create main directory in the home directory\nmkdir \"${HOME_DIR}/StudentFiles\""
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\n# Create the directory structure\nmkdir -p ~/StudentFiles/Class{A,B,C}/{Project1,Project2,Project3}\n\n# Count the number of directories created (including main StudentFiles directory)\nfind ~/StudentFiles -type d | wc -l"
    },
    "original_description": "As a student, you are asked to create a directory structure under your home directory and determine the total number of created directories as your answer. The created directory structure is outlined as follows:\n\n1. Create a main directory named \"StudentFiles\" in your home directory.\n2. Inside \"StudentFiles,\" create three sub-directories named \"ClassA\", \"ClassB\", and \"ClassC\".\n3. In each of the sub-directories, create three more sub-directories named \"Project1\", \"Project2\", and \"Project3\".\n\nSubmit your answer as the total number of directories created, including the main directory.",
    "perturbation_strategy": {
      "strategy_id": "lexical_ambiguity",
      "strategy_name": "Lexical Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "~/StudentFiles",
          "exists": false,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "command",
            "type": "command",
            "value": "mkdir -p ~/StudentFiles/Class{A,B,C}/{Project1,Project2,Project3}",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find ~/StudentFiles -type d | wc -l",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "total number of directories created"
    }
  },
  {
    "description": "Count how many times the error with the code 'ERR0003' appears in the directory of logs, which may include different types of logs from various sources.",
    "explanation": "To solve this problem, you can use grep to search for the specific error code in all the log files within the `log_files` directory. Then, pipe the output to the \"wc -l\" command to count the total number of occurrences of that error code.",
    "create": {
      "init": "\n#!/bin/bash\nmkdir log_files\n\necho \"2022-02-28T10:30:23Z:ERR0001:Permission denied.\" > log_files/server1.log\necho \"2022-02-28T10:31:42Z:ERR0003:Failed to connect to the database.\" >> log_files/server1.log\n\necho \"2022-02-28T10:40:12Z:ERR0002:Invalid input.\" > log_files/server2.log\necho \"2022-02-28T10:45:19Z:ERR0003:Failed to connect to the database.\" >> log_files/server2.log\necho \"2022-02-28T10:50:28Z:ERR0003:Failed to connect to the database.\" >> log_files/server2.log"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "grep -r \"ERR0003\" log_files | wc -l"
    },
    "original_description": "As a student, you are given a directory named `log_files` containing log files from multiple servers. The log files are named as \"server1.log\", \"server2.log\", etc. Each log file contains a list of errors observed on that server.\n\nThe error messages have a specific format: a timestamp followed by an error code and error message, separated by colons. For example:\n\n```\n2022-02-28T10:30:23Z:ERR0001:Permission denied.\n2022-02-28T10:31:42Z:ERR0003:Failed to connect to the database.\n```\n\nYour task is to calculate the total number of errors with the error code 'ERR0003' found in all log files present in the `log_files` directory. \n\nThe answer must be an integer representing the total count of the 'ERR0003' error code in all log files.",
    "perturbation_strategy": {
      "strategy_id": "lexical_ambiguity",
      "strategy_name": "Lexical Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "log_files",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "error_code",
            "type": "pattern",
            "value": "ERR0003",
            "role": "search_pattern"
          },
          {
            "name": "directory",
            "type": "path",
            "value": "log_files",
            "role": "target"
          }
        ],
        "optional": []
      },
      "expected_output": "integer representing the total count of 'ERR0003' error code occurrences"
    }
  },
  {
    "description": "As a student getting familiar with Linux, your task is to come up with a nice directory structure using some general guidelines and then work out how many directories you've created in total.\n\nUsing your Ubuntu command line, loosely follow these ideas:\n\n1. Set up a main directory called 'parentDir'.\n2. Within 'parentDir', form a trio of directories named 'level1_A', 'level1_B', and 'level1_C'.\n3. Nest a couple of directories inside each of the first-level ones with the names 'level2_1' and 'level2_2'.\n4. After establishing the directory layout, determine and provide a sensible count of the directories you made (including the main one).",
    "explanation": "To solve this problem, you can use 'mkdir' to create the directories and 'find' to count the total number of directories created. Perform the following tasks in the command line:\n\n1. Create the directory structure by using the 'mkdir -p' command, which allows you to create multiple sub-directories in a single command.\n2. Navigate into the 'parentDir' directory.\n3. Use the 'find' command along with 'wc -l' to count the total number of directories within the 'parentDir' directory.",
    "create": {
      "init": "\n# There is no required initialization code for this problem, as students must create the directory structure themselves."
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "# Create directories\nmkdir -p parentDir/level1_{A,B,C}/{level2_1,level2_2}\n\n# Navigate into the parent directory\ncd parentDir\n\n# Count the total number of directories\nfind . -type d | wc -l"
    },
    "original_description": "As a student learning Linux operating systems, you are tasked to create a simple directory structure with given rules and then count the total number of directories created. \n\nUsing the command line in your Ubuntu operating system, follow these steps:\n\n1. Create a parent directory named 'parentDir'.\n2. Inside 'parentDir', create 3 directories named 'level1_A', 'level1_B', and 'level1_C'.\n3. Inside each of the level 1 directories, create 2 subdirectories named 'level2_1' and 'level2_2'.\n4. Once the directory structure is created, find and submit the total count of directories created (including the parent directory).",
    "perturbation_strategy": {
      "strategy_id": "vagueness_subjectivity",
      "strategy_name": "Vagueness/Subjectivity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "parentDir",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "parentDir contains directories level1_A, level1_B, level1_C each with subdirectories level2_1 and level2_2",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "command",
            "type": "command",
            "value": "find . -type d | wc -l",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "total count of directories created"
    }
  },
  {
    "description": "Calculate the overall size of everything related to it, ensuring it's just the usual stuff, and tell me the number.",
    "explanation": "To solve this problem, you need to traverse the directory and its subdirectories while searching for files with a \".txt\" extension. You can use 'find', 'stat', and 'awk' commands to achieve this. Here's a breakdown of the solution:\n\n1. Use the 'find' command to search for all \".txt\" files in the directory and its subdirectories.\n2. Use the 'stat' command to obtain the size (in bytes) of each found \".txt\" file.\n3. Use 'awk' to sum up all the sizes of the \".txt\" files.",
    "create": {
      "init": "\n#!/bin/bash\n\n# Create test directory structure\nmkdir -p test_environment/dir1/dir1-1\nmkdir -p test_environment/dir1/dir2-1\nmkdir -p test_environment/dir2\n\n# Create test files\ntouch test_environment/file1.txt\ntruncate -s 10 test_environment/file1.txt\n\ntouch test_environment/file2.txt\ntruncate -s 20 test_environment/file2.txt\n\ntouch test_environment/dir1/dir1-1/file3.txt\ntruncate -s 30 test_environment/dir1/dir1-1/file3.txt\n\ntouch test_environment/dir1/dir2-1/file4.txt\ntruncate -s 40 test_environment/dir1/dir2-1/file4.txt\n\n# Initialize student's working directory\ncp -r test_environment student_directory\n\n# Go to student's working directory\ncd student_directory"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\n\n# Find and sum the size of all \".txt\" files\nfind . -type f -iname \"*.txt\" -exec stat -c \"%s\" {} \\; | awk '{sum += $1} END {print sum}'"
    },
    "original_description": "As an expert in the Linux operating system, you have been given a directory with multiple subdirectories and files of various types and sizes. Your task is to calculate the total size of all files within the main directory and its subdirectories that have a \".txt\" extension. The answer should be in bytes. You must submit a single integer value as the answer.\n\nYou will be provided with a test environment. The following items should be noted:\n\n1. The subdirectories do not have a specific depth and may contain more nested subdirectories.\n2. Some file and directory names might have spaces.\n3. You should consider only regular files (not symbolic links, devices, or other special files).",
    "perturbation_strategy": {
      "strategy_id": "referential_ambiguity",
      "strategy_name": "Referential Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "main directory",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "subdirectories may contain nested subdirectories",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "file_extension",
            "type": "pattern",
            "value": ".txt",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "stat",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "awk",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number representing the total size in bytes"
    }
  },
  {
    "description": "As someone who uses Linux, try to determine a sensible number of lines with the word \"linux\" in some files located in a folder, ignoring the ones that don't fit the usual criteria.",
    "explanation": "To solve this problem, you should use the following tools and concepts:\n\n1. Use the `grep` command to search for the target word in multiple files.\n2. Use `grep` with the `-c` option to count the matching lines.\n3. Use the `find` command to search for files in the given directory.\n4. Combine the `find` and `grep` commands using the `xargs` command to process the output of one command as input to another command, especially when there are whitespace or special characters in filenames.\n5. Pipe the output of the `grep` commands into the `awk` command to sum the individual counts.\n\nSteps to follow:\n\n1. Use `find` to list all the files in the current directory.\n2. Use `xargs` to pass the list of found files to `grep -c`.\n3. Use `grep -c` to count the lines containing the word \"linux\" in each file.\n4. Pipe the output of the `grep` counts into an `awk` command that will sum the counts and print the total.",
    "create": {
      "init": "\n#!/bin/bash\n\nmkdir -p problem_directory\ncd problem_directory\n\necho \"This is a file with the word linux in it.\" > file1.txt\necho \"Another line with the word linux.\" >> file1.txt\n\necho \"This is another file that has linux in its content.\" > file2.txt\necho \"Yet another line mentioning linux.\" >> file2.txt\n\necho \"This file has nothing to do with linux.\" > file3.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "find ./problem_directory -maxdepth 1 -type f -name \"*.txt\" | xargs grep -c \"linux\" | awk '{sum += $1} END {print sum}'"
    },
    "original_description": "As a Linux user, you are required to find the total number of lines that contain a specific word or string across multiple files in a given directory (excluding subdirectories). You will use the word \"linux\" to search within the files. Your answer should be provided as an integer count of the number of lines containing the target word.",
    "perturbation_strategy": {
      "strategy_id": "vagueness_subjectivity",
      "strategy_name": "Vagueness/Subjectivity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "problem_directory",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "files are not in subdirectories",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory_path",
            "type": "path",
            "value": "./problem_directory",
            "role": "target"
          },
          {
            "name": "file_type",
            "type": "string",
            "value": "*.txt",
            "role": "filter"
          },
          {
            "name": "search_pattern",
            "type": "pattern",
            "value": "linux",
            "role": "search_pattern"
          }
        ],
        "optional": []
      },
      "expected_output": "integer count of lines containing the word 'linux'"
    }
  },
  {
    "description": "You are given a directory named \"files\" containing a variable number of text files. Each text file contains a single line of comma-separated figures. Your task is to compute the sum of all figures in each file, and then find the highest sum among all the files in the directory.\n\nFirst, execute an initialization bash script to deploy the environment required for the task. After executing the script, you will find a directory named \"files\" containing multiple text files in your current directory.\n\nYour job is to interact with the shell to calculate the sum of figures in each file and find the highest sum. When you think you have an answer, submit the highest sum as a count.",
    "explanation": "To solve this problem, you can follow these steps:\n\n1. Navigate to the \"files\" directory.\n2. Read the content of each file and calculate the sum.\n3. Store the maximum sum as you iterate through the files.\n4. Display the highest sum as the final output.",
    "create": {
      "init": "\n#!/bin/bash\nmkdir files\ntouch files/file1.txt files/file2.txt files/file3.txt\necho \"5,10,15,20\" > files/file1.txt\necho \"2,4,6,8\" > files/file2.txt\necho \"10,20,30,40\" > files/file3.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\ncd files\nmax_sum=0\nfor file in *\ndo\n  sum=$(awk -F ',' '{for (i=1; i<=NF; ++i) total+=$i} END {print total}' $file)\n  if [ $sum -gt $max_sum ]\n  then\n    max_sum=$sum\n  fi\ndone\n\necho $max_sum"
    },
    "original_description": "You are given a directory named \"files\" containing a variable number of text files. Each text file contains a single line of comma-separated integers. Your task is to compute the sum of all integers in each file, and then find the highest sum among all the files in the directory.\n\nFirst, execute an initialization bash script to deploy the environment required for the problem. After executing the script, you will find a directory named \"files\" containing multiple text files in your current directory.\n\nYour job is to interact with the shell to calculate the sum of integers in each file and find the highest sum. When you think you have an answer, submit the highest sum as an integer.",
    "perturbation_strategy": {
      "strategy_id": "lexical_ambiguity",
      "strategy_name": "Lexical Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "files",
          "exists": true,
          "type": "directory"
        },
        {
          "fact": "Initialization bash script has been executed",
          "must_be_true": true
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "bash_script",
            "type": "command",
            "value": "Initialization bash script",
            "role": "command"
          },
          {
            "name": "directory",
            "type": "path",
            "value": "files",
            "role": "target"
          },
          {
            "name": "operation",
            "type": "command",
            "value": "awk",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "highest sum of integers from all files as an integer"
    }
  },
  {
    "description": "As a Linux system administrator, you are tasked with demonstrating your ability to iterate through directories and calculate the total number of files of a certain kind.\n\nAssuming you are given a specific file extension (e.g., \".txt\"), count the total number of documents with that extension under the \"/home/user/documents\" directory and all its subdirectories. Note that the \"/home/user/documents\" directory could be arbitrarily nested with subdirectories containing no limit to the number of files or other directories.\n\nThe output should be a single integer (the total count of documents with the given extension).",
    "explanation": "To solve this problem, you can use the 'find' command to search for the files with the given extension within the targeted directory and its subdirectories. Next, pipe the search output to the 'wc' command to count the number of lines in the search output, ultimately calculating the total number of files.\n\nYou can use this command pattern to perform the task:\n\n```bash\nfind /path/to/directory -type f -iname \"*.extension\" | wc -l\n```",
    "create": {
      "init": "\n#!/bin/bash\n\n# Create directory structure and files for testing purposes\nmkdir -p /home/user/documents/sub1/sub1_1\nmkdir -p /home/user/documents/sub1/sub1_2\nmkdir -p /home/user/documents/sub2/sub2_1\ntouch /home/user/documents/file1.txt\ntouch /home/user/documents/sub1/file2.txt\ntouch /home/user/documents/sub1/sub1_1/file3.txt\ntouch /home/user/documents/sub1/sub1_1/file4.txt\ntouch /home/user/documents/sub1/sub1_2/file5.doc\ntouch /home/user/documents/sub1/sub1_2/file6.txt\ntouch /home/user/documents/sub2/sub2_1/file7.txt\ntouch /home/user/documents/sub2/sub2_1/file8.pdf"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "find /home/user/documents -type f -iname \"*.txt\" | wc -l"
    },
    "original_description": "As a Linux system administrator, you are tasked with demonstrating your ability to iterate through directories and calculate the total number of files of a specific type.\n\nAssuming you are given a specific file extension (e.g., \".txt\"), count the total number of files with that extension under the \"/home/user/documents\" directory and all its subdirectories. Note that the \"/home/user/documents\" directory could be arbitrarily nested with subdirectories containing no limit to the number of files or other directories.\n\nThe output should be a single integer (the total count of files with the given extension).",
    "perturbation_strategy": {
      "strategy_id": "lexical_ambiguity",
      "strategy_name": "Lexical Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "/home/user/documents",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory_path",
            "type": "path",
            "value": "/home/user/documents",
            "role": "target"
          },
          {
            "name": "file_extension",
            "type": "pattern",
            "value": "*.txt",
            "role": "filter"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number representing the total count of files with the specified extension"
    }
  },
  {
    "description": "Create a script, as a Linux user, to calculate the total storage in kilobytes used by the regular files in the directory \"/var/log\" and print an integer, considering the files within subdirectories.",
    "explanation": "To solve this problem, first, use the find command to filter the regular files in the \"/var/log\" directory. Then, use du command to calculate the storage used by each of these files. Finally, sum up the storage values for all the files.\n\nHint: You can use pipes and xargs with the find and du commands, to make your script more elegant and efficient. To get the sum of values in a column, you can use awk.",
    "create": {
      "init": "\n#!/bin/bash\n# This script doesn't need any initialization because the /var/log directory exists in all Linux distributions as a default."
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "#!/bin/bash\nfind /var/log -type f -exec du -k {} + | awk '{sum += $1} END {print sum}'"
    },
    "original_description": "As a Linux (Ubuntu) user, your task is to create a script that counts the total storage used by all regular files within the \"/var/log\" directory, in kilobytes. Your output should be an integer representing the total storage used by all these files in kilobytes.",
    "perturbation_strategy": {
      "strategy_id": "syntactic_ambiguity",
      "strategy_name": "Syntactic Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "EXECUTE",
      "prerequisites": [
        {
          "entity": "/var/log",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "directory_path",
            "type": "path",
            "value": "/var/log",
            "role": "target"
          },
          {
            "name": "file_type",
            "type": "string",
            "value": "regular",
            "role": "filter"
          },
          {
            "name": "unit",
            "type": "string",
            "value": "kilobytes",
            "role": "output_unit"
          },
          {
            "name": "command",
            "type": "command",
            "value": "find /var/log -type f -exec du -k {} + | awk '{sum += $1} END {print sum}'",
            "role": "command"
          }
        ],
        "optional": []
      },
      "expected_output": "integer representing the total storage used by all regular files in kilobytes"
    }
  },
  {
    "description": "Check all of them for it and tell me how many times you see it.",
    "explanation": "To solve this problem, you can use a combination of shell commands:\n\n1. Use 'grep' with the '-o' option, which will print only the matching part of the lines, and search for the keyword \"ERROR\".\n2. Pipe the output to 'wc -l' to count the number of lines, which will represent the total number of occurrences of the keyword \"ERROR\".",
    "create": {
      "init": "\nmkdir logs\necho \"INFO: Service started\" > logs/log1.txt\necho \"ERROR: Connection failed\" >> logs/log1.txt\necho \"ERROR: File not found\" >> logs/log1.txt\necho \"WARNING: High CPU usage\" >> logs/log1.txt\n\necho \"ERROR: Disk full\" > logs/log2.txt\necho \"INFO: Configuration updated\" >> logs/log2.txt\necho \"INFO: Service restarted\" >> logs/log2.txt\necho \"ERROR: Database connection lost\" >> logs/log2.txt"
    },
    "evaluation": {
      "check": [
        null,
        {
          "language": "python",
          "file": "check/integer-match.py"
        }
      ],
      "example": "grep -o 'ERROR' logs/* | wc -l"
    },
    "original_description": "As a student, you have been given a directory called \"logs\" containing multiple files. Each file contains lines representing status messages and logged events. You need to analyze these log files and answer the following question:\n\nHow many times the keyword \"ERROR\" appears across all the files in the \"logs\" directory?\n\nIn order to accomplish this task, you must create a single command line solution that processes the log files and returns the total count of occurrences of the keyword \"ERROR\" in the \"logs\" directory. Remember, the output must only contain a single integer representing the count of occurrences.",
    "perturbation_strategy": {
      "strategy_id": "referential_ambiguity",
      "strategy_name": "Referential Ambiguity",
      "flaw_category": "expression"
    },
    "true_intent": {
      "action_type": "COUNT",
      "prerequisites": [
        {
          "entity": "logs",
          "exists": true,
          "type": "directory"
        }
      ],
      "parameters": {
        "required": [
          {
            "name": "search_pattern",
            "type": "pattern",
            "value": "ERROR",
            "role": "search_pattern"
          },
          {
            "name": "command",
            "type": "command",
            "value": "grep",
            "role": "command"
          },
          {
            "name": "command",
            "type": "command",
            "value": "wc",
            "role": "command"
          },
          {
            "name": "path",
            "type": "path",
            "value": "logs/*",
            "role": "target"
          }
        ],
        "optional": []
      },
      "expected_output": "integer number representing the count of occurrences of 'ERROR'"
    }
  }
]