{
    "name": "retriever",
    "task_description": "\nYour task is to create a class called Retriever. This class will be used to retrieve similar vectors from a collection of vectors. You should follow the instructions below to complete this task.\n\n\nCreate an instance of the Retriever class by providing two arguments:\nvectors: A numpy array of vectors you want to analyze.\nk: An integer indicating the number of top similar vectors you want to retrieve.\nExample:\n\nfrom numpy import array\nvectors = array([[1, 2], [3, 4], [5, 6]])\nk = 2\nretriever = Retriever(vectors, k)\n\n\nSetting 'k' Value:\n\nUse the set_k method to update the value of k (number of top vectors to retrieve).\nThis method takes a single integer argument.\nThe value of k should be between 1 and the total number of vectors. If not, then the method should do nothing (do not raise an error).\nExample:\nretriever.set_k(3)\n\nAdding New Vectors:\n\nAdd additional vectors to your existing collection using the add_vectors method.\nThis method accepts a numpy array of new vectors to be added.\nExample:\n\nnew_vectors = array([[7, 8], [9, 10]])\nretriever.add_vectors(new_vectors)\n\n\nCalculating Distances:\n\nTo calculate the distance between a query vector and all stored vectors, use the distance method.\nThis method takes a single numpy array representing the query vector.\nIt returns a numpy array of distances.\nExample:\n\n\nquery_vector = array([1, 2])\ndistances = retriever.distance(query_vector)\n\n\nRetrieving Top 'k' Similar Vectors:\n\nUse the get_top_k_similar_vectors method to find the top 'k' vectors most similar to a given query vector.\nThis method takes a single numpy array as the query vector.\nIt returns a numpy array of the top 'k' similar vectors.\n\nExample:\n\ntop_vectors = retriever.get_top_k_similar_vectors(query_vector)\n\nGenerating a Similarity Matrix:\n\nTo create a similarity matrix between multiple queries and the stored vectors, use the get_similarity_matrix method.\nThis method accepts a numpy array of query vectors.\nIt returns a 2D numpy array where each row corresponds to the distances between a query vector and all stored vectors.\n\nExample:\n\nquery_vectors = array([[1, 2], [3, 4]])\nsimilarity_matrix = retriever.get_similarity_matrix(query_vectors)\n",
    "function_signature": "\nclass Retriever:\n",
    "unit_test": "\nimport numpy as np\n\n# Test Initialization\nvectors = np.array([[1, 2], [3, 4], [5, 6]])\nk = 2\nretriever = Retriever(vectors, k)\nassert (retriever.vectors == vectors).all() and retriever.k == k, \"Initialization Failed\"\n\n# Test set_k Method\nretriever.set_k(1)\nassert retriever.k == 1, \"set_k Method Failed\"\nretriever.set_k(0)  # Edge case\nassert retriever.k == 1, \"set_k Method Failed on Edge Case\"\n\n# Test add_vectors Method\nnew_vectors = np.array([[7, 8], [9, 10]])\nretriever.add_vectors(new_vectors)\nassert (retriever.vectors == np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])).all(), \"add_vectors Method Failed\"\n\n# Test distance Method\nquery = np.array([1, 2])\ndistances = retriever.distance(query)\nground_truth_distances = np.array([0, 2.82842712, 5.65685425, 8.48528137, 11.3137085])\nassert np.allclose(distances, ground_truth_distances, atol=1e-3), \"distance Method Failed\"\nassert len(distances) == len(retriever.vectors), \"distance Method Failed\"\n\n# Test get_top_k_similar_vectors Method\ntop_vectors = retriever.get_top_k_similar_vectors(query)\nground_truth_top_vectors = np.array([[1, 2]])\nassert (top_vectors == ground_truth_top_vectors).all(), \"get_top_k_similar_vectors Method Failed\"\nassert len(top_vectors) == retriever.k, \"get_top_k_similar_vectors Method Failed\"\n\n# Test get_similarity_matrix Method\nquery_vectors = np.array([[1, 2], [3, 4]])\nsimilarity_matrix = retriever.get_similarity_matrix(query_vectors)\n\nassert similarity_matrix.shape == (len(query_vectors), len(retriever.vectors)), \"get_similarity_matrix Method Failed\"\n",
    "solution": "\nimport numpy as np\nclass Retriever:\n    def __init__(self, vectors, k):\n        self.vectors = vectors\n        self.k = k\n\n    def set_k(self, k):\n        if k > len(self.vectors) or k < 1:\n            return\n        self.k = k\n\n    def add_vectors(self, new_vectors):\n        self.vectors = np.concatenate((self.vectors, new_vectors))\n        \n    def distance(self, query):\n        ''' \n        query: single numpy arrray\n        return: inverse l2 distances from query to the vectors\n        '''\n        distances = np.linalg.norm(self.vectors - query, axis=1)\n        return distances\n    \n    def get_top_k_similar_vectors(self, query):\n        '''\n        query: single numpy array\n        return: top k similar vectors\n        '''\n        scores = self.distance(query)\n        # np.argsort sorts in ascending order\n        indices_top = np.argsort(scores)\n        top_k_indices = indices_top[:self.k]\n        return self.vectors[top_k_indices]\n    \n    def get_similarity_matrix(self, queries):\n        '''\n        queries: numpy array of query vectors\n        return: similarity matrix of size (len(queries), len(self.vectors))\n        '''\n        similarity_matrix = []\n        for query in queries:\n            similarity_matrix.append(self.distance(query))\n        return np.array(similarity_matrix)\n",
    "type": "lengthy_code"
}