{"message": {"transcript": [{"chunks": [{"end": 0.68, "start": 0.0, "text": "This"}, {"end": 2.48, "start": 0.68, "text": "lecture"}, {"end": 6.84, "start": 2.48, "text": "is"}, {"end": 7.64, "start": 6.84, "text": "a"}, {"end": 8.32, "start": 7.64, "text": "continued"}, {"end": 8.84, "start": 8.32, "text": "discussion"}, {"end": 9.48, "start": 8.84, "text": "of"}, {"end": 10.08, "start": 9.48, "text": "evaluation"}, {"end": 10.12, "start": 10.08, "text": "of"}, {"end": 10.56, "start": 10.12, "text": "text"}, {"end": 12.0, "start": 10.56, "text": "categorization."}, {"end": 12.6, "start": 12.0, "text": "Earlier"}, {"end": 13.16, "start": 12.6, "text": "we"}, {"end": 13.56, "start": 13.16, "text": "have"}, {"end": 14.44, "start": 13.56, "text": "introduced"}, {"end": 15.24, "start": 14.44, "text": "measures"}, {"end": 15.4, "start": 15.24, "text": "that"}, {"end": 15.88, "start": 15.4, "text": "can"}, {"end": 16.0, "start": 15.88, "text": "be"}, {"end": 16.6, "start": 16.0, "text": "used"}, {"end": 16.84, "start": 16.6, "text": "to"}, {"end": 17.72, "start": 16.84, "text": "compute"}, {"end": 18.6, "start": 17.72, "text": "precision"}, {"end": 19.28, "start": 18.6, "text": "and"}, {"end": 19.84, "start": 19.28, "text": "recall"}, {"end": 20.48, "start": 19.84, "text": "for"}, {"end": 20.68, "start": 20.48, "text": "each"}, {"end": 22.52, "start": 20.68, "text": "category"}, {"end": 23.04, "start": 22.52, "text": "and"}, {"end": 23.2, "start": 23.04, "text": "each"}, {"end": 24.4, "start": 23.2, "text": "document."}, {"end": 24.8, "start": 24.4, "text": "Now"}, {"end": 24.88, "start": 24.8, "text": "in"}, {"end": 25.12, "start": 24.88, "text": "this"}, {"end": 25.68, "start": 25.12, "text": "lecture"}, {"end": 26.16, "start": 25.68, "text": "we're"}, {"end": 26.96, "start": 26.16, "text": "going"}, {"end": 27.52, "start": 26.96, "text": "to"}, {"end": 28.24, "start": 27.52, "text": "further"}, {"end": 28.6, "start": 28.24, "text": "examine"}, {"end": 29.04, "start": 
28.6, "text": "how"}, {"end": 29.2, "start": 29.04, "text": "to"}, {"end": 29.76, "start": 29.2, "text": "combine"}, {"end": 29.96, "start": 29.76, "text": "the"}], "text": " This lecture is a continued discussion of evaluation of text categorization. Earlier we have introduced measures that can be used to compute precision and recall for each category and each document. Now in this lecture we're going to further examine how to combine the"}, {"chunks": [{"end": 31.12, "start": 30.0, "text": "performance"}, {"end": 31.36, "start": 31.12, "text": "on"}, {"end": 31.72, "start": 31.36, "text": "these"}, {"end": 31.88, "start": 31.72, "text": "different"}, {"end": 32.4, "start": 31.88, "text": "categories"}, {"end": 32.56, "start": 32.4, "text": "or"}, {"end": 32.76, "start": 32.56, "text": "different"}, {"end": 33.56, "start": 32.76, "text": "documents."}, {"end": 34.2, "start": 33.56, "text": "How"}, {"end": 34.480000000000004, "start": 34.2, "text": "do"}, {"end": 34.480000000000004, "start": 34.480000000000004, "text": "we"}, {"end": 34.519999999999996, "start": 34.480000000000004, "text": "aggregate"}, {"end": 34.96, "start": 34.519999999999996, "text": "them?"}, {"end": 35.480000000000004, "start": 34.96, "text": "How"}, {"end": 35.480000000000004, "start": 35.480000000000004, "text": "do"}, {"end": 35.480000000000004, "start": 35.480000000000004, "text": "we"}, {"end": 35.84, "start": 35.480000000000004, "text": "take"}, {"end": 36.28, "start": 35.84, "text": "average?"}, {"end": 36.4, "start": 36.28, "text": "You"}, {"end": 36.8, "start": 36.4, "text": "see"}, {"end": 36.8, "start": 36.8, "text": "on"}, {"end": 36.8, "start": 36.8, "text": "the"}, {"end": 37.68, "start": 36.8, "text": "title"}, {"end": 38.28, "start": 37.68, "text": "here"}, {"end": 38.64, "start": 38.28, "text": "I"}, {"end": 39.32, "start": 38.64, "text": "indicated"}, {"end": 39.519999999999996, "start": 39.32, "text": "it's"}, {"end": 39.96, "start": 39.519999999999996, "text": "called"}, 
{"end": 40.04, "start": 39.96, "text": "a"}, {"end": 40.64, "start": 40.04, "text": "macro"}, {"end": 40.92, "start": 40.64, "text": "average"}, {"end": 40.92, "start": 40.92, "text": "and"}, {"end": 41.2, "start": 40.92, "text": "this"}, {"end": 41.88, "start": 41.2, "text": "is"}, {"end": 42.04, "start": 41.88, "text": "in"}, {"end": 42.8, "start": 42.04, "text": "contrast"}, {"end": 43.16, "start": 42.8, "text": "to"}, {"end": 43.92, "start": 43.16, "text": "micro"}, {"end": 44.6, "start": 43.92, "text": "average"}, {"end": 44.92, "start": 44.6, "text": "that"}, {"end": 44.92, "start": 44.92, "text": "we'll"}, {"end": 45.16, "start": 44.92, "text": "talk"}, {"end": 45.64, "start": 45.16, "text": "more"}, {"end": 45.92, "start": 45.64, "text": "about"}, {"end": 46.8, "start": 45.92, "text": "later."}, {"end": 47.2, "start": 46.8, "text": "So"}, {"end": 47.64, "start": 47.2, "text": "again"}, {"end": 48.4, "start": 47.64, "text": "for"}, {"end": 48.68, "start": 48.4, "text": "each"}, {"end": 49.32, "start": 48.68, "text": "category"}, {"end": 49.760000000000005, "start": 49.32, "text": "we"}, {"end": 50.2, "start": 49.760000000000005, "text": "can"}, {"end": 50.239999999999995, "start": 50.2, "text": "compute"}, {"end": 50.519999999999996, "start": 50.239999999999995, "text": "the"}, {"end": 51.519999999999996, "start": 50.519999999999996, "text": "precision"}, {"end": 51.879999999999995, "start": 51.519999999999996, "text": "recall"}, {"end": 52.16, "start": 51.879999999999995, "text": "and"}, {"end": 53.16, "start": 52.16, "text": "F1."}, {"end": 53.68, "start": 53.16, "text": "So"}, {"end": 53.84, "start": 53.68, "text": "for"}, {"end": 54.16, "start": 53.84, "text": "example"}, {"end": 54.4, "start": 54.16, "text": "for"}, {"end": 54.84, "start": 54.4, "text": "category"}, {"end": 55.480000000000004, "start": 54.84, "text": "C1"}, {"end": 55.64, "start": 55.480000000000004, "text": "we"}, {"end": 56.120000000000005, "start": 55.64, "text": "have"}, {"end": 
56.64, "start": 56.120000000000005, "text": "precision"}, {"end": 57.519999999999996, "start": 56.64, "text": "P1"}, {"end": 58.120000000000005, "start": 57.519999999999996, "text": "recall"}, {"end": 58.84, "start": 58.120000000000005, "text": "R1"}, {"end": 59.28, "start": 58.84, "text": "and"}, {"end": 59.96, "start": 59.28, "text": "F1"}], "text": " performance on these different categories or different documents. How do we aggregate them? How do we take average? You see on the title here I indicated it's called a macro average and this is in contrast to micro average that we'll talk more about later. So again for each category we can compute the precision recall and F1. So for example for category C1 we have precision P1 recall R1 and F1"}, {"chunks": [{"end": 60.2, "start": 60.0, "text": "and"}, {"end": 60.32, "start": 60.2, "text": "similarly"}, {"end": 60.44, "start": 60.32, "text": "we"}, {"end": 61.36, "start": 60.44, "text": "can"}, {"end": 61.4, "start": 61.36, "text": "do"}, {"end": 62.0, "start": 61.4, "text": "that"}, {"end": 62.72, "start": 62.0, "text": "for"}, {"end": 63.68, "start": 62.72, "text": "category"}, {"end": 64.16, "start": 63.68, "text": "2"}, {"end": 64.16, "start": 64.16, "text": "and"}, {"end": 64.52, "start": 64.16, "text": "all"}, {"end": 64.96, "start": 64.52, "text": "the"}, {"end": 65.4, "start": 64.96, "text": "other"}, {"end": 66.28, "start": 65.4, "text": "categories."}, {"end": 66.76, "start": 66.28, "text": "Once"}, {"end": 67.0, "start": 66.76, "text": "we"}, {"end": 67.28, "start": 67.0, "text": "compute"}, {"end": 67.84, "start": 67.28, "text": "that,"}, {"end": 68.16, "start": 67.84, "text": "we"}, {"end": 68.92, "start": 68.16, "text": "can"}, {"end": 69.72, "start": 68.92, "text": "aggregate"}, {"end": 70.36, "start": 69.72, "text": "them."}, {"end": 70.68, "start": 70.36, "text": "So"}, {"end": 70.96000000000001, "start": 70.68, "text": "for"}, {"end": 71.52, "start": 70.96000000000001, "text": "example,"}, {"end": 
71.68, "start": 71.52, "text": "we"}, {"end": 71.84, "start": 71.68, "text": "can"}, {"end": 72.48, "start": 71.84, "text": "aggregate"}, {"end": 72.76, "start": 72.48, "text": "all"}, {"end": 72.92, "start": 72.76, "text": "the"}, {"end": 73.32, "start": 72.92, "text": "precision"}, {"end": 73.8, "start": 73.32, "text": "values"}, {"end": 74.32, "start": 73.8, "text": "for"}, {"end": 74.6, "start": 74.32, "text": "all"}, {"end": 74.6, "start": 74.6, "text": "the"}, {"end": 75.12, "start": 74.6, "text": "categories"}, {"end": 75.36, "start": 75.12, "text": "to"}, {"end": 75.88, "start": 75.36, "text": "compute"}, {"end": 76.2, "start": 75.88, "text": "the"}, {"end": 76.76, "start": 76.2, "text": "overall"}, {"end": 77.32, "start": 76.76, "text": "precision."}, {"end": 77.32, "start": 77.32, "text": "And"}, {"end": 77.64, "start": 77.32, "text": "this"}, {"end": 78.08, "start": 77.64, "text": "is"}, {"end": 78.44, "start": 78.08, "text": "often"}, {"end": 78.76, "start": 78.44, "text": "very"}, {"end": 79.48, "start": 78.76, "text": "useful"}, {"end": 79.64, "start": 79.48, "text": "to"}, {"end": 80.16, "start": 79.64, "text": "summarize"}, {"end": 80.48, "start": 80.16, "text": "what"}, {"end": 80.96000000000001, "start": 80.48, "text": "we"}, {"end": 81.92, "start": 80.96000000000001, "text": "have"}, {"end": 82.24, "start": 81.92, "text": "seen"}, {"end": 82.4, "start": 82.24, "text": "in"}, {"end": 82.96000000000001, "start": 82.4, "text": "the"}, {"end": 83.44, "start": 82.96000000000001, "text": "whole"}, {"end": 84.0, "start": 83.44, "text": "dataset."}, {"end": 84.03999999999999, "start": 84.0, "text": "And"}, {"end": 84.52, "start": 84.03999999999999, "text": "the"}, {"end": 85.16, "start": 84.52, "text": "aggregation"}, {"end": 85.44, "start": 85.16, "text": "can"}, {"end": 85.44, "start": 85.44, "text": "be"}, {"end": 85.8, "start": 85.44, "text": "done"}, {"end": 85.88, "start": 85.8, "text": "in"}, {"end": 85.92, "start": 85.88, "text": "many"}, {"end": 
86.2, "start": 85.92, "text": "different"}, {"end": 86.84, "start": 86.2, "text": "ways."}, {"end": 87.08, "start": 86.84, "text": "Again,"}, {"end": 87.6, "start": 87.08, "text": "as"}, {"end": 88.03999999999999, "start": 87.6, "text": "I"}, {"end": 88.36, "start": 88.03999999999999, "text": "said,"}, {"end": 88.64, "start": 88.36, "text": "in"}, {"end": 89.03999999999999, "start": 88.64, "text": "case"}, {"end": 89.96000000000001, "start": 89.03999999999999, "text": "when"}], "text": " and similarly we can do that for category 2 and all the other categories. Once we compute that, we can aggregate them. So for example, we can aggregate all the precision values for all the categories to compute the overall precision. And this is often very useful to summarize what we have seen in the whole dataset. And the aggregation can be done in many different ways. Again, as I said, in case when"}, {"chunks": [{"end": 90.6, "start": 90.0, "text": "You"}, {"end": 91.08, "start": 90.6, "text": "need"}, {"end": 91.88, "start": 91.08, "text": "to"}, {"end": 92.48, "start": 91.88, "text": "aggregate"}, {"end": 93.28, "start": 92.48, "text": "different"}, {"end": 94.08, "start": 93.28, "text": "values."}, {"end": 94.44, "start": 94.08, "text": "It's"}, {"end": 94.88, "start": 94.44, "text": "always"}, {"end": 94.88, "start": 94.88, "text": "good"}, {"end": 94.88, "start": 94.88, "text": "to"}, {"end": 95.24, "start": 94.88, "text": "think"}, {"end": 95.6, "start": 95.24, "text": "about"}, {"end": 96.04, "start": 95.6, "text": "what's"}, {"end": 96.04, "start": 96.04, "text": "the"}, {"end": 96.04, "start": 96.04, "text": "best"}, {"end": 96.04, "start": 96.04, "text": "way"}, {"end": 96.04, "start": 96.04, "text": "of"}, {"end": 96.04, "start": 96.04, "text": "doing"}, {"end": 96.08, "start": 96.04, "text": "the"}, {"end": 96.56, "start": 96.08, "text": "aggregation."}, {"end": 96.68, "start": 96.56, "text": "For"}, {"end": 97.32, "start": 96.68, "text": "example,"}, {"end": 97.32, 
"start": 97.32, "text": "you"}, {"end": 97.4, "start": 97.32, "text": "can"}, {"end": 97.88, "start": 97.4, "text": "consider"}, {"end": 98.48, "start": 97.88, "text": "arithmetic"}, {"end": 98.72, "start": 98.48, "text": "mean,"}, {"end": 98.88, "start": 98.72, "text": "which"}, {"end": 99.08, "start": 98.88, "text": "is"}, {"end": 99.44, "start": 99.08, "text": "very"}, {"end": 100.12, "start": 99.44, "text": "commonly"}, {"end": 100.64, "start": 100.12, "text": "used,"}, {"end": 100.84, "start": 100.64, "text": "or"}, {"end": 101.2, "start": 100.84, "text": "you"}, {"end": 102.16, "start": 101.2, "text": "can"}, {"end": 102.96000000000001, "start": 102.16, "text": "use"}, {"end": 103.72, "start": 102.96000000000001, "text": "geometric"}, {"end": 103.76, "start": 103.72, "text": "mean,"}, {"end": 104.0, "start": 103.76, "text": "which"}, {"end": 104.2, "start": 104.0, "text": "would"}, {"end": 104.64, "start": 104.2, "text": "have"}, {"end": 105.03999999999999, "start": 104.64, "text": "different"}, {"end": 105.76, "start": 105.03999999999999, "text": "behavior."}, {"end": 106.03999999999999, "start": 105.76, "text": "Depending"}, {"end": 106.03999999999999, "start": 106.03999999999999, "text": "on"}, {"end": 106.12, "start": 106.03999999999999, "text": "the"}, {"end": 107.28, "start": 106.12, "text": "way"}, {"end": 107.56, "start": 107.28, "text": "you"}, {"end": 108.03999999999999, "start": 107.56, "text": "aggregate,"}, {"end": 108.16, "start": 108.03999999999999, "text": "you"}, {"end": 108.6, "start": 108.16, "text": "might"}, {"end": 108.88, "start": 108.6, "text": "get"}, {"end": 109.36, "start": 108.88, "text": "different"}, {"end": 110.44, "start": 109.36, "text": "conclusions"}, {"end": 110.6, "start": 110.44, "text": "in"}, {"end": 111.0, "start": 110.6, "text": "terms"}, {"end": 111.08, "start": 111.0, "text": "of"}, {"end": 111.56, "start": 111.08, "text": "which"}, {"end": 112.08, "start": 111.56, "text": "method"}, {"end": 112.64, "start": 112.08, 
"text": "works"}, {"end": 112.76, "start": 112.64, "text": "better."}, {"end": 112.76, "start": 112.76, "text": "So"}, {"end": 113.08, "start": 112.76, "text": "it's"}, {"end": 113.6, "start": 113.08, "text": "important"}, {"end": 113.6, "start": 113.6, "text": "to"}, {"end": 114.03999999999999, "start": 113.6, "text": "consider"}, {"end": 114.32, "start": 114.03999999999999, "text": "these"}, {"end": 115.12, "start": 114.32, "text": "differences"}, {"end": 115.6, "start": 115.12, "text": "and"}, {"end": 116.2, "start": 115.6, "text": "choosing"}, {"end": 116.6, "start": 116.2, "text": "the"}, {"end": 116.88, "start": 116.6, "text": "right"}, {"end": 117.32, "start": 116.88, "text": "one"}, {"end": 117.64, "start": 117.32, "text": "or"}, {"end": 118.44, "start": 117.64, "text": "more"}, {"end": 119.16, "start": 118.44, "text": "suitable"}, {"end": 119.28, "start": 119.16, "text": "one"}, {"end": 119.28, "start": 119.28, "text": "for"}, {"end": 119.32, "start": 119.28, "text": "your"}, {"end": 119.96000000000001, "start": 119.32, "text": "task."}], "text": " You need to aggregate different values. It's always good to think about what's the best way of doing the aggregation. For example, you can consider arithmetic mean, which is very commonly used, or you can use geometric mean, which would have different behavior. Depending on the way you aggregate, you might get different conclusions in terms of which method works better. 
So it's important to consider these differences and choosing the right one or more suitable one for your task."}, {"chunks": [{"end": 120.8, "start": 120.0, "text": "So"}, {"end": 121.4, "start": 120.8, "text": "the"}, {"end": 121.76, "start": 121.4, "text": "difference,"}, {"end": 121.88, "start": 121.76, "text": "for"}, {"end": 122.32, "start": 121.88, "text": "example,"}, {"end": 122.6, "start": 122.32, "text": "between"}, {"end": 123.4, "start": 122.6, "text": "arithmetic"}, {"end": 123.48, "start": 123.4, "text": "mean"}, {"end": 124.08, "start": 123.48, "text": "and"}, {"end": 124.68, "start": 124.08, "text": "geometric"}, {"end": 124.84, "start": 124.68, "text": "mean"}, {"end": 125.12, "start": 124.84, "text": "is"}, {"end": 125.52, "start": 125.12, "text": "that"}, {"end": 125.56, "start": 125.52, "text": "the"}, {"end": 125.8, "start": 125.56, "text": "arithmetic"}, {"end": 125.92, "start": 125.8, "text": "mean"}, {"end": 126.68, "start": 125.92, "text": "would"}, {"end": 127.32, "start": 126.68, "text": "be"}, {"end": 127.8, "start": 127.32, "text": "dominated"}, {"end": 128.08, "start": 127.8, "text": "by"}, {"end": 128.56, "start": 128.08, "text": "high"}, {"end": 129.2, "start": 128.56, "text": "values,"}, {"end": 129.84, "start": 129.2, "text": "whereas"}, {"end": 130.36, "start": 129.84, "text": "geometric"}, {"end": 130.48, "start": 130.36, "text": "mean"}, {"end": 130.64, "start": 130.48, "text": "would"}, {"end": 130.76, "start": 130.64, "text": "be"}, {"end": 130.92, "start": 130.76, "text": "more"}, {"end": 131.24, "start": 130.92, "text": "affected"}, {"end": 131.24, "start": 131.24, "text": "by"}, {"end": 131.28, "start": 131.24, "text": "low"}, {"end": 131.92, "start": 131.28, "text": "values."}, {"end": 132.6, "start": 131.92, "text": "And"}, {"end": 133.2, "start": 132.6, "text": "so"}, {"end": 133.56, "start": 133.2, "text": "whether"}, {"end": 133.8, "start": 133.56, "text": "you"}, {"end": 134.16, "start": 133.8, "text": "want"}, 
{"end": 134.32, "start": 134.16, "text": "to"}, {"end": 135.48, "start": 134.32, "text": "emphasize"}, {"end": 135.92, "start": 135.48, "text": "low"}, {"end": 136.68, "start": 135.92, "text": "values"}, {"end": 136.88, "start": 136.68, "text": "or"}, {"end": 137.04, "start": 136.88, "text": "high"}, {"end": 137.72, "start": 137.04, "text": "values"}, {"end": 137.84, "start": 137.72, "text": "would"}, {"end": 138.04, "start": 137.84, "text": "be"}, {"end": 138.24, "start": 138.04, "text": "a"}, {"end": 139.6, "start": 138.24, "text": "question"}, {"end": 139.96, "start": 139.6, "text": "related"}, {"end": 140.04, "start": 139.96, "text": "to"}, {"end": 140.4, "start": 140.04, "text": "your"}, {"end": 140.92000000000002, "start": 140.4, "text": "application."}, {"end": 141.04, "start": 140.92000000000002, "text": "And"}, {"end": 141.44, "start": 141.04, "text": "similarly,"}, {"end": 141.44, "start": 141.44, "text": "we"}, {"end": 142.0, "start": 141.44, "text": "can"}, {"end": 142.0, "start": 142.0, "text": "do"}, {"end": 142.04, "start": 142.0, "text": "that"}, {"end": 142.8, "start": 142.04, "text": "for"}, {"end": 143.6, "start": 142.8, "text": "recall"}, {"end": 144.12, "start": 143.6, "text": "and"}, {"end": 144.68, "start": 144.12, "text": "F-score."}, {"end": 144.96, "start": 144.68, "text": "So"}, {"end": 145.36, "start": 144.96, "text": "that's"}, {"end": 145.92000000000002, "start": 145.36, "text": "how"}, {"end": 145.92000000000002, "start": 145.92000000000002, "text": "we"}, {"end": 145.92000000000002, "start": 145.92000000000002, "text": "can"}, {"end": 145.96, "start": 145.92000000000002, "text": "then"}, {"end": 146.28, "start": 145.96, "text": "generate"}, {"end": 146.76, "start": 146.28, "text": "the"}, {"end": 147.44, "start": 146.76, "text": "overall"}, {"end": 148.16, "start": 147.44, "text": "precision"}, {"end": 148.44, "start": 148.16, "text": "recall"}, {"end": 148.96, "start": 148.44, "text": "and"}, {"end": 149.96, "start": 148.96, "text": 
"F-score."}], "text": " So the difference, for example, between arithmetic mean and geometric mean is that the arithmetic mean would be dominated by high values, whereas geometric mean would be more affected by low values. And so whether you want to emphasize low values or high values would be a question related to your application. And similarly, we can do that for recall and F-score. So that's how we can then generate the overall precision recall and F-score."}, {"chunks": [{"end": 150.04, "start": 150.0, "text": "Now,"}, {"end": 150.24, "start": 150.04, "text": "we"}, {"end": 150.6, "start": 150.24, "text": "can"}, {"end": 151.0, "start": 150.6, "text": "do"}, {"end": 151.28, "start": 151.0, "text": "the"}, {"end": 151.56, "start": 151.28, "text": "same"}, {"end": 153.16, "start": 151.56, "text": "for"}, {"end": 153.96, "start": 153.16, "text": "aggregation"}, {"end": 154.88, "start": 153.96, "text": "over"}, {"end": 155.32, "start": 154.88, "text": "all"}, {"end": 155.76, "start": 155.32, "text": "the"}, {"end": 156.44, "start": 155.76, "text": "documents."}, {"end": 156.84, "start": 156.44, "text": "It's"}, {"end": 157.76, "start": 156.84, "text": "exactly"}, {"end": 158.24, "start": 157.76, "text": "the"}, {"end": 158.48, "start": 158.24, "text": "same"}, {"end": 159.0, "start": 158.48, "text": "situation"}, {"end": 159.2, "start": 159.0, "text": "for"}, {"end": 159.28, "start": 159.2, "text": "each"}, {"end": 160.0, "start": 159.28, "text": "document,"}, {"end": 160.24, "start": 160.0, "text": "we'll"}, {"end": 160.28, "start": 160.24, "text": "compute"}, {"end": 160.72, "start": 160.28, "text": "precision"}, {"end": 161.28, "start": 160.72, "text": "recall"}, {"end": 161.84, "start": 161.28, "text": "and"}, {"end": 162.08, "start": 161.84, "text": "F."}, {"end": 162.24, "start": 162.08, "text": "And"}, {"end": 162.48, "start": 162.24, "text": "then"}, {"end": 162.8, "start": 162.48, "text": "after"}, {"end": 163.4, "start": 162.8, "text": "we"}, {"end": 
164.2, "start": 163.4, "text": "have"}, {"end": 164.84, "start": 164.2, "text": "completed"}, {"end": 165.28, "start": 164.84, "text": "the"}, {"end": 165.68, "start": 165.28, "text": "computation"}, {"end": 166.32, "start": 165.68, "text": "for"}, {"end": 166.68, "start": 166.32, "text": "all"}, {"end": 166.96, "start": 166.68, "text": "these"}, {"end": 167.12, "start": 166.96, "text": "documents,"}, {"end": 167.48, "start": 167.12, "text": "we're"}, {"end": 167.92000000000002, "start": 167.48, "text": "going"}, {"end": 168.04, "start": 167.92000000000002, "text": "to"}, {"end": 168.32, "start": 168.04, "text": "aggregate"}, {"end": 168.32, "start": 168.32, "text": "them"}, {"end": 168.36, "start": 168.32, "text": "to"}, {"end": 168.48, "start": 168.36, "text": "generate"}, {"end": 168.8, "start": 168.48, "text": "the"}, {"end": 169.36, "start": 168.8, "text": "overall"}, {"end": 170.24, "start": 169.36, "text": "precision,"}, {"end": 170.6, "start": 170.24, "text": "overall"}, {"end": 171.0, "start": 170.6, "text": "recall"}, {"end": 171.28, "start": 171.0, "text": "and"}, {"end": 171.96, "start": 171.28, "text": "overall"}, {"end": 172.28, "start": 171.96, "text": "F"}, {"end": 172.88, "start": 172.28, "text": "score."}, {"end": 173.2, "start": 172.88, "text": "These"}, {"end": 173.24, "start": 173.2, "text": "are"}, {"end": 173.44, "start": 173.24, "text": "again"}, {"end": 174.64, "start": 173.44, "text": "examining"}, {"end": 175.28, "start": 174.64, "text": "the"}, {"end": 175.68, "start": 175.28, "text": "results"}, {"end": 175.88, "start": 175.68, "text": "from"}, {"end": 176.16, "start": 175.88, "text": "different"}, {"end": 176.96, "start": 176.16, "text": "angles"}, {"end": 176.96, "start": 176.96, "text": "and"}, {"end": 177.32, "start": 176.96, "text": "which"}, {"end": 177.36, "start": 177.32, "text": "one"}, {"end": 177.56, "start": 177.36, "text": "is"}, {"end": 178.04, "start": 177.56, "text": "more"}, {"end": 178.76, "start": 178.04, "text": 
"useful"}, {"end": 178.8, "start": 178.76, "text": "will"}, {"end": 179.0, "start": 178.8, "text": "depend"}, {"end": 179.16, "start": 179.0, "text": "on"}, {"end": 179.36, "start": 179.16, "text": "your"}, {"end": 179.96, "start": 179.36, "text": "application."}], "text": " Now, we can do the same for aggregation over all the documents. It's exactly the same situation for each document, we'll compute precision recall and F. And then after we have completed the computation for all these documents, we're going to aggregate them to generate the overall precision, overall recall and overall F score. These are again examining the results from different angles and which one is more useful will depend on your application."}, {"chunks": [{"end": 180.52, "start": 180.0, "text": "In"}, {"end": 181.12, "start": 180.52, "text": "general,"}, {"end": 182.12, "start": 181.12, "text": "it's"}, {"end": 182.68, "start": 182.12, "text": "beneficial"}, {"end": 182.72, "start": 182.68, "text": "to"}, {"end": 183.32, "start": 182.72, "text": "look"}, {"end": 183.52, "start": 183.32, "text": "at"}, {"end": 183.6, "start": 183.52, "text": "the"}, {"end": 184.08, "start": 183.6, "text": "results"}, {"end": 184.24, "start": 184.08, "text": "from"}, {"end": 184.52, "start": 184.24, "text": "all"}, {"end": 184.8, "start": 184.52, "text": "these"}, {"end": 186.0, "start": 184.8, "text": "perspectives."}, {"end": 186.48, "start": 186.0, "text": "And"}, {"end": 187.16, "start": 186.48, "text": "especially"}, {"end": 187.36, "start": 187.16, "text": "if"}, {"end": 187.36, "start": 187.36, "text": "you"}, {"end": 187.44, "start": 187.36, "text": "compare"}, {"end": 187.96, "start": 187.44, "text": "different"}, {"end": 188.64, "start": 187.96, "text": "methods"}, {"end": 189.48, "start": 188.64, "text": "in"}, {"end": 189.72, "start": 189.48, "text": "different"}, {"end": 190.68, "start": 189.72, "text": "dimensions,"}, {"end": 190.96, "start": 190.68, "text": "it"}, {"end": 191.36, "start": 
190.96, "text": "might"}, {"end": 192.12, "start": 191.36, "text": "reveal"}, {"end": 192.72, "start": 192.12, "text": "which"}, {"end": 193.68, "start": 192.72, "text": "method"}, {"end": 193.88, "start": 193.68, "text": "is"}, {"end": 194.24, "start": 193.88, "text": "better"}, {"end": 194.52, "start": 194.24, "text": "in"}, {"end": 195.56, "start": 194.52, "text": "which"}, {"end": 196.0, "start": 195.56, "text": "measure"}, {"end": 196.4, "start": 196.0, "text": "or"}, {"end": 196.48, "start": 196.4, "text": "in"}, {"end": 196.76, "start": 196.48, "text": "what"}, {"end": 197.68, "start": 196.76, "text": "situations."}, {"end": 198.0, "start": 197.68, "text": "And"}, {"end": 198.36, "start": 198.0, "text": "this"}, {"end": 198.68, "start": 198.36, "text": "provides"}, {"end": 199.16, "start": 198.68, "text": "insight"}, {"end": 199.76, "start": 199.16, "text": "for"}, {"end": 200.56, "start": 199.76, "text": "understanding"}, {"end": 200.68, "start": 200.56, "text": "the"}, {"end": 201.48, "start": 200.68, "text": "strengths"}, {"end": 201.96, "start": 201.48, "text": "of"}, {"end": 202.24, "start": 201.96, "text": "a"}, {"end": 202.8, "start": 202.24, "text": "method"}, {"end": 202.84, "start": 202.8, "text": "or"}, {"end": 203.36, "start": 202.84, "text": "weakness,"}, {"end": 203.88, "start": 203.36, "text": "and"}, {"end": 204.2, "start": 203.88, "text": "this"}, {"end": 204.48, "start": 204.2, "text": "provides"}, {"end": 204.84, "start": 204.48, "text": "further"}, {"end": 206.6, "start": 204.84, "text": "insight"}, {"end": 206.84, "start": 206.6, "text": "for"}, {"end": 207.44, "start": 206.84, "text": "improving"}, {"end": 208.16, "start": 207.44, "text": "them."}, {"end": 208.92000000000002, "start": 208.16, "text": "So"}, {"end": 209.4, "start": 208.92000000000002, "text": "as"}, {"end": 209.6, "start": 209.4, "text": "I"}, {"end": 209.96, "start": 209.6, "text": "mentioned,"}], "text": " In general, it's beneficial to look at the results from all 
these perspectives. And especially if you compare different methods in different dimensions, it might reveal which method is better in which measure or in what situations. And this provides insight for understanding the strengths of a method or weakness, and this provides further insight for improving them. So as I mentioned,"}, {"chunks": [{"end": 210.2, "start": 210.0, "text": "There"}, {"end": 210.48, "start": 210.2, "text": "is"}, {"end": 210.84, "start": 210.48, "text": "also"}, {"end": 211.52, "start": 210.84, "text": "micro"}, {"end": 211.96, "start": 211.52, "text": "averaging"}, {"end": 212.08, "start": 211.96, "text": "in"}, {"end": 212.8, "start": 212.08, "text": "contrast"}, {"end": 212.8, "start": 212.8, "text": "to"}, {"end": 212.96, "start": 212.8, "text": "the"}, {"end": 213.44, "start": 212.96, "text": "macro"}, {"end": 213.68, "start": 213.44, "text": "averaging"}, {"end": 213.92, "start": 213.68, "text": "that"}, {"end": 214.04, "start": 213.92, "text": "we"}, {"end": 214.68, "start": 214.04, "text": "talked"}, {"end": 215.4, "start": 214.68, "text": "about"}, {"end": 215.76, "start": 215.4, "text": "earlier."}, {"end": 215.84, "start": 215.76, "text": "In"}, {"end": 216.32, "start": 215.84, "text": "this"}, {"end": 216.88, "start": 216.32, "text": "case,"}, {"end": 217.88, "start": 216.88, "text": "what"}, {"end": 218.0, "start": 217.88, "text": "we"}, {"end": 218.0, "start": 218.0, "text": "do"}, {"end": 218.0, "start": 218.0, "text": "is"}, {"end": 218.08, "start": 218.0, "text": "to"}, {"end": 218.56, "start": 218.08, "text": "pool"}, {"end": 218.96, "start": 218.56, "text": "together"}, {"end": 219.24, "start": 218.96, "text": "all"}, {"end": 219.32, "start": 219.24, "text": "the"}, {"end": 220.28, "start": 219.32, "text": "decisions"}, {"end": 220.96, "start": 220.28, "text": "and"}, {"end": 221.88, "start": 220.96, "text": "then"}, {"end": 222.12, "start": 221.88, "text": "compute"}, {"end": 222.76, "start": 222.12, "text": "the"}, {"end": 
223.32, "start": 222.76, "text": "precision"}, {"end": 223.88, "start": 223.32, "text": "and"}, {"end": 224.68, "start": 223.88, "text": "recall."}, {"end": 225.0, "start": 224.68, "text": "So"}, {"end": 225.32, "start": 225.0, "text": "we"}, {"end": 225.96, "start": 225.32, "text": "can"}, {"end": 226.32, "start": 225.96, "text": "compute"}, {"end": 226.4, "start": 226.32, "text": "the"}, {"end": 227.12, "start": 226.4, "text": "overall"}, {"end": 228.0, "start": 227.12, "text": "precision"}, {"end": 228.8, "start": 228.0, "text": "and"}, {"end": 229.12, "start": 228.8, "text": "recall"}, {"end": 229.64, "start": 229.12, "text": "by"}, {"end": 229.96, "start": 229.64, "text": "just"}, {"end": 230.44, "start": 229.96, "text": "counting"}, {"end": 230.72, "start": 230.44, "text": "how"}, {"end": 231.0, "start": 230.72, "text": "many"}, {"end": 231.68, "start": 231.0, "text": "cases"}, {"end": 232.12, "start": 231.68, "text": "are"}, {"end": 232.36, "start": 232.12, "text": "in"}, {"end": 232.56, "start": 232.36, "text": "true"}, {"end": 233.07999999999998, "start": 232.56, "text": "positive,"}, {"end": 233.32, "start": 233.07999999999998, "text": "how"}, {"end": 233.48, "start": 233.32, "text": "many"}, {"end": 234.12, "start": 233.48, "text": "cases"}, {"end": 234.84, "start": 234.12, "text": "in"}, {"end": 235.2, "start": 234.84, "text": "false"}, {"end": 235.72, "start": 235.2, "text": "positive,"}, {"end": 235.84, "start": 235.72, "text": "et"}, {"end": 236.36, "start": 235.84, "text": "cetera,"}, {"end": 236.76, "start": 236.36, "text": "basically"}, {"end": 237.32, "start": 236.76, "text": "computing"}, {"end": 237.32, "start": 237.32, "text": "the"}, {"end": 237.44, "start": 237.32, "text": "values"}, {"end": 237.76, "start": 237.44, "text": "to"}, {"end": 238.04, "start": 237.76, "text": "fill"}, {"end": 238.12, "start": 238.04, "text": "in"}, {"end": 239.07999999999998, "start": 238.12, "text": "this"}, {"end": 239.96, "start": 239.07999999999998, "text": 
"contingency table."}], "text": " There is also micro averaging in contrast to the macro averaging that we talked about earlier. In this case, what we do is to pool together all the decisions and then compute the precision and recall. So we can compute the overall precision and recall by just counting how many cases are in true positive, how many cases in false positive, et cetera, basically computing the values to fill in this contingency table."}, {"chunks": [{"end": 240.0, "start": 240.0, "text": "In"}, {"end": 240.0, "start": 240.0, "text": "the"}, {"end": 240.0, "start": 240.0, "text": "table"}, {"end": 240.2, "start": 240.0, "text": "and"}, {"end": 240.44, "start": 240.2, "text": "then"}, {"end": 240.48, "start": 240.44, "text": "we"}, {"end": 240.56, "start": 240.48, "text": "can"}, {"end": 241.08, "start": 240.56, "text": "compute"}, {"end": 241.24, "start": 241.08, "text": "the"}, {"end": 241.84, "start": 241.24, "text": "precision"}, {"end": 242.52, "start": 241.84, "text": "and"}, {"end": 243.28, "start": 242.52, "text": "recall"}, {"end": 243.68, "start": 243.28, "text": "just"}, {"end": 244.84, "start": 243.68, "text": "once."}, {"end": 245.52, "start": 244.84, "text": "Now"}, {"end": 246.08, "start": 245.52, "text": "in"}, {"end": 246.76, "start": 246.08, "text": "contrast"}, {"end": 246.88, "start": 246.76, "text": "in"}, {"end": 247.36, "start": 246.88, "text": "macro"}, {"end": 247.56, "start": 247.36, "text": "averaging"}, {"end": 247.68, "start": 247.56, "text": "we're"}, {"end": 247.96, "start": 247.68, "text": "going"}, {"end": 248.08, "start": 247.96, "text": "to"}, {"end": 248.28, "start": 248.08, "text": "do"}, {"end": 248.72, "start": 248.28, "text": "that"}, {"end": 248.96, "start": 248.72, "text": "for"}, {"end": 249.24, "start": 248.96, "text": "each"}, {"end": 249.92, "start": 249.24, "text": "category"}, {"end": 250.6, "start": 249.92, "text": "first"}, {"end": 251.04, "start": 250.6, "text": "and"}, {"end": 251.52, "start": 251.04, "text": "then"}, 
{"end": 253.12, "start": 251.52, "text": "aggregate"}, {"end": 253.4, "start": 253.12, "text": "over"}, {"end": 253.52, "start": 253.4, "text": "these"}, {"end": 254.16, "start": 253.52, "text": "categories"}, {"end": 254.36, "start": 254.16, "text": "or"}, {"end": 254.52, "start": 254.36, "text": "we"}, {"end": 254.52, "start": 254.52, "text": "do"}, {"end": 254.96, "start": 254.52, "text": "that"}, {"end": 255.56, "start": 254.96, "text": "for"}, {"end": 255.76, "start": 255.56, "text": "each"}, {"end": 256.16, "start": 255.76, "text": "document"}, {"end": 256.52, "start": 256.16, "text": "and"}, {"end": 256.76, "start": 256.52, "text": "then"}, {"end": 256.88, "start": 256.76, "text": "aggregate"}, {"end": 257.08, "start": 256.88, "text": "over"}, {"end": 257.92, "start": 257.08, "text": "all"}, {"end": 258.48, "start": 257.92, "text": "the"}, {"end": 259.24, "start": 258.48, "text": "documents."}, {"end": 259.56, "start": 259.24, "text": "But"}, {"end": 260.04, "start": 259.56, "text": "here"}, {"end": 260.12, "start": 260.04, "text": "we"}, {"end": 260.12, "start": 260.12, "text": "put"}, {"end": 260.28, "start": 260.12, "text": "them"}, {"end": 261.04, "start": 260.28, "text": "together."}, {"end": 261.36, "start": 261.04, "text": "Now"}, {"end": 261.68, "start": 261.36, "text": "this"}, {"end": 261.68, "start": 261.68, "text": "will"}, {"end": 261.84, "start": 261.68, "text": "be"}, {"end": 262.44, "start": 261.84, "text": "very"}, {"end": 262.68, "start": 262.44, "text": "similar"}, {"end": 262.68, "start": 262.68, "text": "to"}, {"end": 262.92, "start": 262.68, "text": "the"}, {"end": 263.56, "start": 262.92, "text": "classification"}, {"end": 264.04, "start": 263.56, "text": "accuracy"}, {"end": 264.4, "start": 264.04, "text": "that"}, {"end": 264.52, "start": 264.4, "text": "we"}, {"end": 265.2, "start": 264.52, "text": "introduced"}, {"end": 265.72, "start": 265.2, "text": "earlier."}, {"end": 265.92, "start": 265.72, "text": "And"}, {"end": 266.12, 
"start": 265.92, "text": "one"}, {"end": 267.04, "start": 266.12, "text": "problem"}, {"end": 267.28, "start": 267.04, "text": "here"}, {"end": 267.36, "start": 267.28, "text": "of"}, {"end": 267.72, "start": 267.36, "text": "course"}, {"end": 268.16, "start": 267.72, "text": "is"}, {"end": 268.16, "start": 268.16, "text": "to"}, {"end": 268.72, "start": 268.16, "text": "treat"}, {"end": 268.96, "start": 268.72, "text": "all"}, {"end": 269.08, "start": 268.96, "text": "the"}, {"end": 269.96, "start": 269.08, "text": "instances,"}], "text": " In the table and then we can compute the precision and recall just once. Now in contrast in macro averaging we're going to do that for each category first and then aggregate over these categories or we do that for each document and then aggregate over all the documents. But here we put them together. Now this will be very similar to the classification accuracy that we introduced earlier. And one problem here of course is to treat all the instances,"}, {"chunks": [{"end": 270.2, "start": 270.0, "text": "the"}, {"end": 270.92, "start": 270.2, "text": "decisions"}, {"end": 272.32, "start": 270.92, "text": "equally"}, {"end": 272.84, "start": 272.32, "text": "and"}, {"end": 273.48, "start": 272.84, "text": "this"}, {"end": 274.16, "start": 273.48, "text": "may"}, {"end": 274.4, "start": 274.16, "text": "not"}, {"end": 274.44, "start": 274.4, "text": "be"}, {"end": 275.2, "start": 274.44, "text": "desirable"}, {"end": 275.48, "start": 275.2, "text": "but"}, {"end": 275.48, "start": 275.48, "text": "it"}, {"end": 275.96, "start": 275.48, "text": "may"}, {"end": 276.08, "start": 275.96, "text": "be"}, {"end": 276.24, "start": 276.08, "text": "a"}, {"end": 277.04, "start": 276.24, "text": "property"}, {"end": 277.6, "start": 277.04, "text": "for"}, {"end": 277.92, "start": 277.6, "text": "some"}, {"end": 279.08, "start": 277.92, "text": "applications"}, {"end": 279.72, "start": 279.08, "text": "especially"}, {"end": 279.8, "start": 
279.72, "text": "if"}, {"end": 279.96, "start": 279.8, "text": "we"}, {"end": 280.84, "start": 279.96, "text": "associate"}, {"end": 281.08, "start": 280.84, "text": "the"}, {"end": 281.84, "start": 281.08, "text": "for"}, {"end": 282.12, "start": 281.84, "text": "example"}, {"end": 282.36, "start": 282.12, "text": "the"}, {"end": 283.0, "start": 282.36, "text": "cost"}, {"end": 283.28, "start": 283.0, "text": "for"}, {"end": 283.96, "start": 283.28, "text": "each"}, {"end": 284.84, "start": 283.96, "text": "combination"}, {"end": 285.12, "start": 284.84, "text": "then"}, {"end": 285.48, "start": 285.12, "text": "we"}, {"end": 286.4, "start": 285.48, "text": "can"}, {"end": 287.16, "start": 286.4, "text": "actually"}, {"end": 287.2, "start": 287.16, "text": "compute"}, {"end": 287.4, "start": 287.2, "text": "the"}, {"end": 287.76, "start": 287.4, "text": "for"}, {"end": 288.2, "start": 287.76, "text": "example"}, {"end": 288.68, "start": 288.2, "text": "weighted"}, {"end": 289.52, "start": 288.68, "text": "classification"}, {"end": 290.08, "start": 289.52, "text": "accuracy"}, {"end": 290.6, "start": 290.08, "text": "where"}, {"end": 290.8, "start": 290.6, "text": "you"}, {"end": 291.2, "start": 290.8, "text": "associate"}, {"end": 291.32, "start": 291.2, "text": "the"}, {"end": 291.84, "start": 291.32, "text": "different"}, {"end": 292.28, "start": 291.84, "text": "cost"}, {"end": 292.56, "start": 292.28, "text": "or"}, {"end": 293.32, "start": 292.56, "text": "utility"}, {"end": 293.6, "start": 293.32, "text": "for"}, {"end": 294.12, "start": 293.6, "text": "each"}, {"end": 295.4, "start": 294.12, "text": "specific"}, {"end": 296.16, "start": 295.4, "text": "decision"}, {"end": 296.36, "start": 296.16, "text": "so"}, {"end": 296.8, "start": 296.36, "text": "there"}, {"end": 296.8, "start": 296.8, "text": "could"}, {"end": 297.08, "start": 296.8, "text": "be"}, {"end": 
297.72, "start": 297.08, "text": "variations"}, {"end": 297.8, "start": 297.72, "text": "of"}, {"end": 298.04, "start": 297.8, "text": "these"}, {"end": 298.4, "start": 298.04, "text": "methods"}, {"end": 298.76, "start": 298.4, "text": "that"}, {"end": 298.76, "start": 298.76, "text": "would"}, {"end": 298.92, "start": 298.76, "text": "be"}, {"end": 299.16, "start": 298.92, "text": "more"}, {"end": 299.6, "start": 299.16, "text": "useful"}, {"end": 299.6, "start": 299.6, "text": "but"}, {"end": 299.6, "start": 299.6, "text": "in"}, {"end": 299.96, "start": 299.6, "text": "general"}], "text": " the decisions equally and this may not be desirable but it may be a property for some applications especially if we associate the for example the cost for each combination then we can actually compute the for example weighted classification accuracy where you associate the different cost or utility for each specific decision so there could be variations of these methods that would be more useful but in general"}, {"chunks": [{"end": 301.04, "start": 300.0, "text": "general"}, {"end": 301.96, "start": 301.04, "text": "macro"}, {"end": 302.56, "start": 301.96, "text": "average"}, {"end": 303.08, "start": 302.56, "text": "tends"}, {"end": 303.24, "start": 303.08, "text": "to"}, {"end": 303.32, "start": 303.24, "text": "be"}, {"end": 303.56, "start": 303.32, "text": "more"}, {"end": 304.72, "start": 303.56, "text": "informative"}, {"end": 305.52, "start": 304.72, "text": "than"}, {"end": 306.04, "start": 305.52, "text": "micro"}, {"end": 307.48, "start": 306.04, "text": "averaging"}, {"end": 307.8, "start": 307.48, "text": "just"}, {"end": 309.0, "start": 307.8, "text": "because"}, {"end": 309.8, "start": 309.0, "text": "it"}, {"end": 310.2, "start": 309.8, "text": "might"}, {"end": 311.52, "start": 310.2, "text": "reflect"}, {"end": 312.0, "start": 311.52, "text": "the"}, {"end": 312.4, "start": 312.0, "text": "need"}, {"end": 312.64, "start": 312.4, "text": "for"}, {"end": 
313.2, "start": 312.64, "text": "understanding"}, {"end": 314.36, "start": 313.2, "text": "performance"}, {"end": 314.76, "start": 314.36, "text": "on"}, {"end": 314.76, "start": 314.76, "text": "each"}, {"end": 315.52, "start": 314.76, "text": "category"}, {"end": 315.8, "start": 315.52, "text": "or"}, {"end": 316.28, "start": 315.8, "text": "performance"}, {"end": 316.36, "start": 316.28, "text": "on"}, {"end": 316.56, "start": 316.36, "text": "each"}, {"end": 317.32, "start": 316.56, "text": "document"}, {"end": 317.72, "start": 317.32, "text": "which"}, {"end": 317.96, "start": 317.72, "text": "are"}, {"end": 318.24, "start": 317.96, "text": "needed"}, {"end": 318.8, "start": 318.24, "text": "in"}, {"end": 319.08, "start": 318.8, "text": "many"}, {"end": 320.56, "start": 319.08, "text": "applications"}, {"end": 321.76, "start": 320.56, "text": "but"}, {"end": 322.2, "start": 321.76, "text": "macro"}, {"end": 323.16, "start": 322.2, "text": "averaging"}, {"end": 323.8, "start": 323.16, "text": "and"}, {"end": 324.28, "start": 323.8, "text": "micro"}, {"end": 325.0, "start": 324.28, "text": "averaging"}, {"end": 325.52, "start": 325.0, "text": "they"}, {"end": 325.68, "start": 325.52, "text": "are"}, {"end": 326.08, "start": 325.68, "text": "both"}, {"end": 326.32, "start": 326.08, "text": "very"}, {"end": 327.0, "start": 326.32, "text": "common"}, {"end": 327.32, "start": 327.0, "text": "and"}, {"end": 327.76, "start": 327.32, "text": "you"}, {"end": 328.2, "start": 327.76, "text": "might"}, {"end": 328.36, "start": 328.2, "text": "see"}, {"end": 329.96, "start": 328.36, "text": "both"}], "text": " general macro average tends to be more informative than micro averaging just because it might reflect the need for understanding performance on each category or performance on each document which are needed in many applications but macro averaging and micro averaging they are both very common and you might see both"}, {"chunks": [{"end": 330.56, "start": 330.0, 
"text": "in"}, {"end": 331.44, "start": 330.56, "text": "research"}, {"end": 331.76, "start": 331.44, "text": "papers"}, {"end": 331.76, "start": 331.76, "text": "on"}, {"end": 331.96, "start": 331.76, "text": "textual"}, {"end": 332.72, "start": 331.96, "text": "categorization."}, {"end": 333.08, "start": 332.72, "text": "Also,"}, {"end": 333.32, "start": 333.08, "text": "sometimes"}, {"end": 334.36, "start": 333.32, "text": "categorization"}, {"end": 334.88, "start": 334.36, "text": "results"}, {"end": 335.32, "start": 334.88, "text": "might"}, {"end": 336.72, "start": 335.32, "text": "actually"}, {"end": 337.16, "start": 336.72, "text": "be"}, {"end": 337.96, "start": 337.16, "text": "evaluated"}, {"end": 338.2, "start": 337.96, "text": "from"}, {"end": 338.6, "start": 338.2, "text": "ranking"}, {"end": 340.36, "start": 338.6, "text": "perspective."}, {"end": 340.88, "start": 340.36, "text": "This"}, {"end": 341.2, "start": 340.88, "text": "is"}, {"end": 341.68, "start": 341.2, "text": "because"}, {"end": 342.4, "start": 341.68, "text": "categorization"}, {"end": 342.84, "start": 342.4, "text": "results"}, {"end": 342.96, "start": 342.84, "text": "are"}, {"end": 343.84, "start": 342.96, "text": "sometimes"}, {"end": 344.0, "start": 343.84, "text": "or"}, {"end": 344.48, "start": 344.0, "text": "often"}, {"end": 345.32, "start": 344.48, "text": "indeed"}, {"end": 346.12, "start": 345.32, "text": "passed"}, {"end": 346.6, "start": 346.12, "text": "to"}, {"end": 347.04, "start": 346.6, "text": "a"}, {"end": 347.8, "start": 347.04, "text": "human"}, {"end": 348.4, "start": 347.8, "text": "for"}, {"end": 349.12, "start": 348.4, "text": "various"}, {"end": 349.72, "start": 349.12, "text": "purposes."}, {"end": 350.76, "start": 349.72, "text": "For"}, {"end": 351.68, "start": 350.76, "text": "example,"}, {"end": 352.24, "start": 351.68, "text": "it"}, {"end": 352.8, "start": 352.24, "text": "might"}, {"end": 352.84, "start": 352.8, "text": "be"}, {"end": 353.04, 
"start": 352.84, "text": "passed"}, {"end": 353.24, "start": 353.04, "text": "to"}, {"end": 353.72, "start": 353.24, "text": "humans"}, {"end": 354.08, "start": 353.72, "text": "for"}, {"end": 354.12, "start": 354.08, "text": "further"}, {"end": 354.24, "start": 354.12, "text": "editing."}, {"end": 354.8, "start": 354.24, "text": "For"}, {"end": 355.0, "start": 354.8, "text": "example,"}, {"end": 355.56, "start": 355.0, "text": "news"}, {"end": 356.36, "start": 355.56, "text": "articles"}, {"end": 356.44, "start": 356.36, "text": "can"}, {"end": 356.48, "start": 356.44, "text": "be"}, {"end": 357.04, "start": 356.48, "text": "tentatively"}, {"end": 357.6, "start": 357.04, "text": "categorized"}, {"end": 357.76, "start": 357.6, "text": "by"}, {"end": 357.76, "start": 357.76, "text": "using"}, {"end": 358.04, "start": 357.76, "text": "a"}, {"end": 358.8, "start": 358.04, "text": "system"}, {"end": 359.08, "start": 358.8, "text": "and"}, {"end": 359.44, "start": 359.08, "text": "then"}, {"end": 359.44, "start": 359.44, "text": "human"}, {"end": 359.84, "start": 359.44, "text": "editors"}, {"end": 359.88, "start": 359.84, "text": "would"}, {"end": 359.96, "start": 359.88, "text": "then"}], "text": " in research papers on textual categorization. Also, sometimes categorization results might actually be evaluated from ranking perspective. This is because categorization results are sometimes or often indeed passed to a human for various purposes. For example, it might be passed to humans for further editing. 
For example, news articles can be tentatively categorized by using a system and then human editors would then"}, {"chunks": [{"end": 361.48, "start": 360.0, "text": "correct"}, {"end": 361.96, "start": 361.48, "text": "them."}, {"end": 362.12, "start": 361.96, "text": "Or"}, {"end": 362.28, "start": 362.12, "text": "the"}, {"end": 362.6, "start": 362.28, "text": "email"}, {"end": 363.8, "start": 362.6, "text": "messages"}, {"end": 364.44, "start": 363.8, "text": "might"}, {"end": 364.6, "start": 364.44, "text": "be"}, {"end": 365.36, "start": 364.6, "text": "routed"}, {"end": 365.72, "start": 365.36, "text": "to"}, {"end": 366.48, "start": 365.72, "text": "the"}, {"end": 366.72, "start": 366.48, "text": "right"}, {"end": 367.24, "start": 366.72, "text": "person"}, {"end": 367.68, "start": 367.24, "text": "for"}, {"end": 367.92, "start": 367.68, "text": "handling"}, {"end": 368.12, "start": 367.92, "text": "in"}, {"end": 368.52, "start": 368.12, "text": "the"}, {"end": 368.84, "start": 368.52, "text": "help"}, {"end": 369.56, "start": 368.84, "text": "desk."}, {"end": 370.04, "start": 369.56, "text": "In"}, {"end": 370.28, "start": 370.04, "text": "such"}, {"end": 370.36, "start": 370.28, "text": "a"}, {"end": 371.0, "start": 370.36, "text": "case,"}, {"end": 371.4, "start": 371.0, "text": "the"}, {"end": 371.96, "start": 371.4, "text": "categorization"}, {"end": 372.12, "start": 371.96, "text": "is"}, {"end": 372.36, "start": 372.12, "text": "to"}, {"end": 372.76, "start": 372.36, "text": "help"}, {"end": 374.0, "start": 372.76, "text": "prioritizing"}, {"end": 374.44, "start": 374.0, "text": "the"}, {"end": 375.76, "start": 374.44, "text": "task"}, {"end": 376.28, "start": 375.76, "text": "for"}, {"end": 376.28, "start": 376.28, "text": "a"}, {"end": 377.2, "start": 376.28, "text": "particular"}, {"end": 377.92, "start": 377.2, "text": "customer"}, {"end": 378.8, "start": 377.92, "text": "service"}, {"end": 379.6, "start": 378.8, "text": "person."}, {"end": 
380.16, "start": 379.6, "text": "So"}, {"end": 380.72, "start": 380.16, "text": "in"}, {"end": 381.12, "start": 380.72, "text": "this"}, {"end": 382.24, "start": 381.12, "text": "case,"}, {"end": 383.08, "start": 382.24, "text": "the"}, {"end": 383.64, "start": 383.08, "text": "results"}, {"end": 384.0, "start": 383.64, "text": "have"}, {"end": 384.6, "start": 384.0, "text": "to"}, {"end": 384.72, "start": 384.6, "text": "be"}, {"end": 385.68, "start": 384.72, "text": "prioritized."}, {"end": 385.96, "start": 385.68, "text": "And"}, {"end": 386.48, "start": 385.96, "text": "if"}, {"end": 387.52, "start": 386.48, "text": "the"}, {"end": 388.08, "start": 387.52, "text": "system"}, {"end": 388.44, "start": 388.08, "text": "can"}, {"end": 389.04, "start": 388.44, "text": "give"}, {"end": 389.48, "start": 389.04, "text": "a"}, {"end": 389.96, "start": 389.48, "text": "score"}], "text": " correct them. Or the email messages might be routed to the right person for handling in the help desk. In such a case, the categorization is to help prioritizing the task for a particular customer service person. So in this case, the results have to be prioritized. 
And if the system can give a score"}, {"chunks": [{"end": 390.32, "start": 390.0, "text": "to"}, {"end": 390.52, "start": 390.32, "text": "the"}, {"end": 391.36, "start": 390.52, "text": "categorization"}, {"end": 392.28, "start": 391.36, "text": "decision"}, {"end": 393.04, "start": 392.28, "text": "or"}, {"end": 394.56, "start": 393.04, "text": "confidence,"}, {"end": 394.96, "start": 394.56, "text": "then"}, {"end": 395.2, "start": 394.96, "text": "we"}, {"end": 395.6, "start": 395.2, "text": "can"}, {"end": 396.0, "start": 395.6, "text": "use"}, {"end": 396.24, "start": 396.0, "text": "the"}, {"end": 397.08, "start": 396.24, "text": "scores"}, {"end": 397.72, "start": 397.08, "text": "to"}, {"end": 398.08, "start": 397.72, "text": "rank"}, {"end": 398.2, "start": 398.08, "text": "these"}, {"end": 399.16, "start": 398.2, "text": "decisions"}, {"end": 399.52, "start": 399.16, "text": "and"}, {"end": 399.76, "start": 399.52, "text": "then"}, {"end": 400.32, "start": 399.76, "text": "evaluate"}, {"end": 400.44, "start": 400.32, "text": "the"}, {"end": 401.16, "start": 400.44, "text": "results"}, {"end": 401.64, "start": 401.16, "text": "as"}, {"end": 401.76, "start": 401.64, "text": "a"}, {"end": 401.88, "start": 401.76, "text": "ranked"}, {"end": 402.4, "start": 401.88, "text": "list,"}, {"end": 402.76, "start": 402.4, "text": "just"}, {"end": 403.48, "start": 402.76, "text": "as"}, {"end": 403.96, "start": 403.48, "text": "in"}, {"end": 404.48, "start": 403.96, "text": "search"}, {"end": 404.6, "start": 404.48, "text": "engine"}, {"end": 405.28, "start": 404.6, "text": "evaluation,"}, {"end": 405.44, "start": 405.28, "text": "where"}, {"end": 405.68, "start": 405.44, "text": "you"}, {"end": 406.2, "start": 405.68, "text": "rank"}, {"end": 406.4, "start": 406.2, "text": "the"}, {"end": 407.12, "start": 406.4, "text": "documents"}, {"end": 407.16, "start": 407.12, "text": "in"}, {"end": 407.64, "start": 407.16, "text": "response"}, {"end": 408.24, "start": 407.64, 
"text": "to"}, {"end": 408.96, "start": 408.24, "text": "query."}, {"end": 409.36, "start": 408.96, "text": "So"}, {"end": 409.96, "start": 409.36, "text": "for"}, {"end": 411.16, "start": 409.96, "text": "example,"}, {"end": 411.72, "start": 411.16, "text": "a"}, {"end": 412.4, "start": 411.72, "text": "discovery"}, {"end": 412.52, "start": 412.4, "text": "of"}, {"end": 413.16, "start": 412.52, "text": "spam"}, {"end": 413.28, "start": 413.16, "text": "emails"}, {"end": 414.16, "start": 413.28, "text": "can"}, {"end": 414.44, "start": 414.16, "text": "be"}, {"end": 415.72, "start": 414.44, "text": "evaluated"}, {"end": 416.32, "start": 415.72, "text": "based"}, {"end": 417.0, "start": 416.32, "text": "on"}, {"end": 417.72, "start": 417.0, "text": "ranking"}, {"end": 418.44, "start": 417.72, "text": "emails"}, {"end": 418.92, "start": 418.44, "text": "for"}, {"end": 419.24, "start": 418.92, "text": "the"}, {"end": 419.48, "start": 419.24, "text": "spam"}, {"end": 419.96, "start": 419.48, "text": "category."}], "text": " to the categorization decision or confidence, then we can use the scores to rank these decisions and then evaluate the results as a ranked list, just as in search engine evaluation, where you rank the documents in response to query. 
So for example, a discovery of spam emails can be evaluated based on ranking emails for the spam category."}, {"chunks": [{"end": 420.08, "start": 420.0, "text": "This"}, {"end": 420.68, "start": 420.08, "text": "is"}, {"end": 421.4, "start": 420.68, "text": "useful"}, {"end": 421.68, "start": 421.4, "text": "if"}, {"end": 421.84, "start": 421.68, "text": "you"}, {"end": 422.44, "start": 421.84, "text": "want"}, {"end": 422.8, "start": 422.44, "text": "people"}, {"end": 423.04, "start": 422.8, "text": "to"}, {"end": 423.6, "start": 423.04, "text": "verify"}, {"end": 423.84, "start": 423.6, "text": "whether"}, {"end": 424.48, "start": 423.84, "text": "this"}, {"end": 424.8, "start": 424.48, "text": "is"}, {"end": 425.16, "start": 424.8, "text": "really"}, {"end": 425.4, "start": 425.16, "text": "a"}, {"end": 425.68, "start": 425.4, "text": "spam."}, {"end": 425.92, "start": 425.68, "text": "The"}, {"end": 426.76, "start": 425.92, "text": "person"}, {"end": 426.96, "start": 426.76, "text": "would"}, {"end": 427.28, "start": 426.96, "text": "then"}, {"end": 427.48, "start": 427.28, "text": "take"}, {"end": 427.88, "start": 427.48, "text": "the"}, {"end": 428.24, "start": 427.88, "text": "ranked"}, {"end": 428.36, "start": 428.24, "text": "list"}, {"end": 428.4, "start": 428.36, "text": "to"}, {"end": 428.76, "start": 428.4, "text": "check"}, {"end": 429.32, "start": 428.76, "text": "one"}, {"end": 430.12, "start": 429.32, "text": "by"}, {"end": 430.76, "start": 430.12, "text": "one"}, {"end": 431.16, "start": 430.76, "text": "and"}, {"end": 431.64, "start": 431.16, "text": "then"}, {"end": 432.08, "start": 431.64, "text": "verify"}, {"end": 432.8, "start": 432.08, "text": "whether"}, {"end": 433.44, "start": 432.8, "text": "this"}, {"end": 433.96, "start": 433.44, "text": "is"}, {"end": 434.0, "start": 433.96, "text": "indeed"}, {"end": 434.0, "start": 434.0, "text": "a"}, {"end": 434.0, "start": 434.0, "text": "spam."}, {"end": 434.0, "start": 434.0, "text": "So"}, 
{"end": 434.12, "start": 434.0, "text": "to"}, {"end": 435.2, "start": 434.12, "text": "reflect"}, {"end": 435.8, "start": 435.2, "text": "the"}, {"end": 436.32, "start": 435.8, "text": "utility"}, {"end": 436.68, "start": 436.32, "text": "for"}, {"end": 437.4, "start": 436.68, "text": "humans"}, {"end": 437.8, "start": 437.4, "text": "in"}, {"end": 438.36, "start": 437.8, "text": "such"}, {"end": 438.72, "start": 438.36, "text": "a"}, {"end": 438.8, "start": 438.72, "text": "task,"}, {"end": 439.24, "start": 438.8, "text": "it's"}, {"end": 439.6, "start": 439.24, "text": "better"}, {"end": 439.72, "start": 439.6, "text": "to"}, {"end": 440.16, "start": 439.72, "text": "evaluate"}, {"end": 440.4, "start": 440.16, "text": "the"}, {"end": 440.84, "start": 440.4, "text": "ranking"}, {"end": 441.36, "start": 440.84, "text": "accuracy."}, {"end": 441.88, "start": 441.36, "text": "This"}, {"end": 442.24, "start": 441.88, "text": "is"}, {"end": 442.96, "start": 442.24, "text": "basically"}, {"end": 443.12, "start": 442.96, "text": "similar"}, {"end": 443.16, "start": 443.12, "text": "to"}, {"end": 443.72, "start": 443.16, "text": "search."}, {"end": 443.88, "start": 443.72, "text": "In"}, {"end": 444.36, "start": 443.88, "text": "such"}, {"end": 444.88, "start": 444.36, "text": "a"}, {"end": 445.44, "start": 444.88, "text": "case,"}, {"end": 445.92, "start": 445.44, "text": "often"}, {"end": 446.48, "start": 445.92, "text": "the"}, {"end": 447.08, "start": 446.48, "text": "problem"}, {"end": 447.56, "start": 447.08, "text": "can"}, {"end": 447.56, "start": 447.56, "text": "be"}, {"end": 448.0, "start": 447.56, "text": "better"}, {"end": 448.52, "start": 448.0, "text": "formulated"}, {"end": 448.92, "start": 448.52, "text": "as"}, {"end": 449.0, "start": 448.92, "text": "a"}, {"end": 449.0, "start": 449.0, "text": "ranking"}, {"end": 449.0, "start": 449.0, "text": "problem"}, {"end": 449.28, "start": 449.0, "text": "instead"}, {"end": 449.96, "start": 449.28, "text": 
"of"}], "text": " This is useful if you want people to verify whether this is really a spam. The person would then take the ranked list to check one by one and then verify whether this is indeed a spam. So to reflect the utility for humans in such a task, it's better to evaluate the ranking accuracy. This is basically similar to search. In such a case, often the problem can be better formulated as a ranking problem instead of"}, {"chunks": [{"end": 450.84, "start": 450.0, "text": "categorization"}, {"end": 451.24, "start": 450.84, "text": "problem."}, {"end": 451.68, "start": 451.24, "text": "For"}, {"end": 452.44, "start": 451.68, "text": "example,"}, {"end": 453.04, "start": 452.44, "text": "ranking"}, {"end": 453.52, "start": 453.04, "text": "documents"}, {"end": 453.52, "start": 453.52, "text": "in"}, {"end": 453.6, "start": 453.52, "text": "the"}, {"end": 454.04, "start": 453.6, "text": "search"}, {"end": 454.32, "start": 454.04, "text": "engine"}, {"end": 454.32, "start": 454.32, "text": "can"}, {"end": 454.32, "start": 454.32, "text": "also"}, {"end": 454.32, "start": 454.32, "text": "be"}, {"end": 454.36, "start": 454.32, "text": "framed"}, {"end": 454.56, "start": 454.36, "text": "as"}, {"end": 454.64, "start": 454.56, "text": "a"}, {"end": 454.64, "start": 454.64, "text": "binary"}, {"end": 455.96, "start": 454.64, "text": "categorization"}, {"end": 456.48, "start": 455.96, "text": "problem."}, {"end": 457.28, "start": 456.48, "text": "Distinguish"}, {"end": 457.56, "start": 457.28, "text": "relevant"}, {"end": 458.16, "start": 457.56, "text": "documents"}, {"end": 458.68, "start": 458.16, "text": "that"}, {"end": 458.96, "start": 458.68, "text": "are"}, {"end": 459.52, "start": 458.96, "text": "useful"}, {"end": 459.8, "start": 459.52, "text": "to"}, {"end": 460.56, "start": 459.8, "text": "users"}, {"end": 460.8, "start": 460.56, "text": "from"}, {"end": 461.2, "start": 460.8, "text": "those"}, {"end": 461.4, "start": 461.2, "text": "that"}, {"end": 
461.56, "start": 461.4, "text": "are"}, {"end": 461.6, "start": 461.56, "text": "not"}, {"end": 462.04, "start": 461.6, "text": "useful."}, {"end": 462.12, "start": 462.04, "text": "But"}, {"end": 462.52, "start": 462.12, "text": "typically"}, {"end": 462.52, "start": 462.52, "text": "we"}, {"end": 462.52, "start": 462.52, "text": "frame"}, {"end": 462.6, "start": 462.52, "text": "this"}, {"end": 463.04, "start": 462.6, "text": "as"}, {"end": 463.48, "start": 463.04, "text": "a"}, {"end": 464.52, "start": 463.48, "text": "ranking"}, {"end": 465.52, "start": 464.52, "text": "problem"}, {"end": 466.12, "start": 465.52, "text": "and"}, {"end": 466.4, "start": 466.12, "text": "we"}, {"end": 466.96, "start": 466.4, "text": "evaluate"}, {"end": 467.36, "start": 466.96, "text": "it"}, {"end": 468.12, "start": 467.36, "text": "as"}, {"end": 468.6, "start": 468.12, "text": "a"}, {"end": 469.48, "start": 468.6, "text": "ranking"}, {"end": 469.72, "start": 469.48, "text": "list."}, {"end": 470.36, "start": 469.72, "text": "That's"}, {"end": 470.72, "start": 470.36, "text": "because"}, {"end": 470.72, "start": 470.72, "text": "people"}, {"end": 470.72, "start": 470.72, "text": "tend"}, {"end": 470.76, "start": 470.72, "text": "to"}, {"end": 471.24, "start": 470.76, "text": "examine"}, {"end": 471.4, "start": 471.24, "text": "the"}, {"end": 472.0, "start": 471.4, "text": "results"}, {"end": 472.6, "start": 472.0, "text": "sequentially."}, {"end": 472.6, "start": 472.6, "text": "Ranking"}, {"end": 473.16, "start": 472.6, "text": "evaluation"}, {"end": 473.52, "start": 473.16, "text": "more"}, {"end": 474.08, "start": 473.52, "text": "reflects"}, {"end": 474.12, "start": 474.08, "text": "the"}, {"end": 475.12, "start": 474.12, "text": "utility"}, {"end": 475.48, "start": 475.12, "text": "from"}, {"end": 475.6, "start": 475.48, "text": "a"}, {"end": 476.4, "start": 475.6, "text": "user's"}, {"end": 477.28, "start": 476.4, "text": "perspective."}, {"end": 478.92, "start": 477.28, 
"text": "To"}, {"end": 479.96, "start": 478.92, "text": "summarize"}], "text": " categorization problem. For example, ranking documents in the search engine can also be framed as a binary categorization problem. Distinguish relevant documents that are useful to users from those that are not useful. But typically we frame this as a ranking problem and we evaluate it as a ranking list. That's because people tend to examine the results sequentially. Ranking evaluation more reflects the utility from a user's perspective. To summarize"}, {"chunks": [{"end": 481.04, "start": 480.0, "text": "Categorization"}, {"end": 481.68, "start": 481.04, "text": "evaluation."}, {"end": 481.84, "start": 481.68, "text": "First,"}, {"end": 482.88, "start": 481.84, "text": "evaluation"}, {"end": 483.28, "start": 482.88, "text": "is"}, {"end": 483.52, "start": 483.28, "text": "always"}, {"end": 483.84, "start": 483.52, "text": "very"}, {"end": 484.36, "start": 483.84, "text": "important"}, {"end": 484.48, "start": 484.36, "text": "for"}, {"end": 484.76, "start": 484.48, "text": "all"}, {"end": 485.68, "start": 484.76, "text": "these"}, {"end": 486.12, "start": 485.68, "text": "tasks,"}, {"end": 486.32, "start": 486.12, "text": "so"}, {"end": 486.32, "start": 486.32, "text": "get"}, {"end": 486.32, "start": 486.32, "text": "it"}, {"end": 486.32, "start": 486.32, "text": "right."}, {"end": 486.32, "start": 486.32, "text": "If"}, {"end": 486.32, "start": 486.32, "text": "you"}, {"end": 486.32, "start": 486.32, "text": "don't"}, {"end": 486.68, "start": 486.32, "text": "get"}, {"end": 487.0, "start": 486.68, "text": "it"}, {"end": 487.52, "start": 487.0, "text": "right,"}, {"end": 488.08, "start": 487.52, "text": "you"}, {"end": 488.72, "start": 488.08, "text": "might"}, {"end": 489.08, "start": 488.72, "text": "get"}, {"end": 489.52, "start": 489.08, "text": "misleading"}, {"end": 490.48, "start": 489.52, "text": "results"}, {"end": 490.68, "start": 490.48, "text": "and"}, {"end": 490.68, 
"start": 490.68, "text": "you"}, {"end": 490.8, "start": 490.68, "text": "might"}, {"end": 490.84, "start": 490.8, "text": "be"}, {"end": 491.44, "start": 490.84, "text": "misled"}, {"end": 491.44, "start": 491.44, "text": "to"}, {"end": 491.64, "start": 491.44, "text": "believe"}, {"end": 492.24, "start": 491.64, "text": "one"}, {"end": 492.96, "start": 492.24, "text": "method"}, {"end": 493.12, "start": 492.96, "text": "is"}, {"end": 493.2, "start": 493.12, "text": "better"}, {"end": 493.2, "start": 493.2, "text": "than"}, {"end": 493.36, "start": 493.2, "text": "the"}, {"end": 493.8, "start": 493.36, "text": "other,"}, {"end": 494.04, "start": 493.8, "text": "which"}, {"end": 494.6, "start": 494.04, "text": "is"}, {"end": 494.68, "start": 494.6, "text": "in"}, {"end": 494.96, "start": 494.68, "text": "fact"}, {"end": 495.56, "start": 494.96, "text": "not"}, {"end": 495.72, "start": 495.56, "text": "true."}, {"end": 495.88, "start": 495.72, "text": "So"}, {"end": 496.08, "start": 495.88, "text": "it's"}, {"end": 496.48, "start": 496.08, "text": "very"}, {"end": 497.76, "start": 496.48, "text": "important"}, {"end": 498.0, "start": 497.76, "text": "to"}, {"end": 498.28, "start": 498.0, "text": "get"}, {"end": 498.56, "start": 498.28, "text": "it"}, {"end": 499.0, "start": 498.56, "text": "right."}, {"end": 499.56, "start": 499.0, "text": "Meshes"}, {"end": 499.8, "start": 499.56, "text": "must"}, {"end": 500.24, "start": 499.8, "text": "also"}, {"end": 500.68, "start": 500.24, "text": "reflect"}, {"end": 500.92, "start": 500.68, "text": "the"}, {"end": 501.08, "start": 500.92, "text": "intended"}, {"end": 501.44, "start": 501.08, "text": "use"}, {"end": 501.44, "start": 501.44, "text": "of"}, {"end": 501.68, "start": 501.44, "text": "the"}, {"end": 502.08, "start": 501.68, "text": "results"}, {"end": 502.36, "start": 502.08, "text": "for"}, {"end": 502.36, "start": 502.36, "text": "a"}, {"end": 502.68, "start": 502.36, "text": "particular"}, {"end": 503.28, 
"start": 502.68, "text": "application."}, {"end": 503.92, "start": 503.28, "text": "For"}, {"end": 504.48, "start": 503.92, "text": "example,"}, {"end": 504.68, "start": 504.48, "text": "in"}, {"end": 505.08, "start": 504.68, "text": "spam"}, {"end": 505.68, "start": 505.08, "text": "filtering"}, {"end": 505.76, "start": 505.68, "text": "and"}, {"end": 505.76, "start": 505.76, "text": "news"}, {"end": 506.68, "start": 505.76, "text": "categorization,"}, {"end": 506.8, "start": 506.68, "text": "the"}, {"end": 507.0, "start": 506.8, "text": "results"}, {"end": 507.44, "start": 507.0, "text": "are"}, {"end": 508.08, "start": 507.44, "text": "used"}, {"end": 508.48, "start": 508.08, "text": "in"}, {"end": 508.84, "start": 508.48, "text": "maybe"}, {"end": 509.16, "start": 508.84, "text": "different"}, {"end": 509.96, "start": 509.16, "text": "ways."}], "text": " Categorization evaluation. First, evaluation is always very important for all these tasks, so get it right. If you don't get it right, you might get misleading results and you might be misled to believe one method is better than the other, which is in fact not true. So it's very important to get it right. Meshes must also reflect the intended use of the results for a particular application. 
For example, in spam filtering and news categorization, the results are used in maybe different ways."}, {"chunks": [{"end": 510.6, "start": 510.0, "text": "So"}, {"end": 511.36, "start": 510.6, "text": "then"}, {"end": 511.6, "start": 511.36, "text": "we"}, {"end": 512.08, "start": 511.6, "text": "would"}, {"end": 512.4, "start": 512.08, "text": "need"}, {"end": 512.56, "start": 512.4, "text": "to"}, {"end": 513.04, "start": 512.56, "text": "consider"}, {"end": 513.08, "start": 513.04, "text": "the"}, {"end": 513.64, "start": 513.08, "text": "difference"}, {"end": 513.72, "start": 513.64, "text": "and"}, {"end": 514.32, "start": 513.72, "text": "design"}, {"end": 514.92, "start": 514.32, "text": "measures"}, {"end": 515.44, "start": 514.92, "text": "appropriately."}, {"end": 515.8, "start": 515.44, "text": "We"}, {"end": 516.04, "start": 515.8, "text": "generally"}, {"end": 516.32, "start": 516.04, "text": "need"}, {"end": 516.48, "start": 516.32, "text": "to"}, {"end": 517.08, "start": 516.48, "text": "consider"}, {"end": 517.8, "start": 517.08, "text": "how"}, {"end": 518.0, "start": 517.8, "text": "will"}, {"end": 518.76, "start": 518.0, "text": "the"}, {"end": 519.48, "start": 518.76, "text": "results"}, {"end": 519.6, "start": 519.48, "text": "be"}, {"end": 519.92, "start": 519.6, "text": "further"}, {"end": 520.4, "start": 519.92, "text": "processed"}, {"end": 520.64, "start": 520.4, "text": "by"}, {"end": 520.72, "start": 520.64, "text": "a"}, {"end": 521.16, "start": 520.72, "text": "user"}, {"end": 521.36, "start": 521.16, "text": "and"}, {"end": 521.76, "start": 521.36, "text": "then"}, {"end": 521.96, "start": 521.76, "text": "think"}, {"end": 522.24, "start": 521.96, "text": "from"}, {"end": 522.32, "start": 522.24, "text": "a"}, {"end": 522.76, "start": 522.32, "text": "user's"}, {"end": 523.52, "start": 522.76, "text": "perspective"}, {"end": 524.24, "start": 523.52, "text": "what"}, {"end": 524.76, "start": 524.24, "text": "quality"}, {"end": 525.08, 
"start": 524.76, "text": "is"}, {"end": 525.72, "start": 525.08, "text": "important,"}, {"end": 526.2, "start": 525.72, "text": "what"}, {"end": 526.84, "start": 526.2, "text": "aspect"}, {"end": 526.96, "start": 526.84, "text": "of"}, {"end": 526.96, "start": 526.96, "text": "quality"}, {"end": 526.96, "start": 526.96, "text": "is"}, {"end": 527.56, "start": 526.96, "text": "important."}, {"end": 528.32, "start": 527.56, "text": "Sometimes"}, {"end": 528.96, "start": 528.32, "text": "there"}, {"end": 529.52, "start": 528.96, "text": "are"}, {"end": 530.56, "start": 529.52, "text": "trade-offs"}, {"end": 530.84, "start": 530.56, "text": "between"}, {"end": 531.12, "start": 530.84, "text": "multiple"}, {"end": 531.72, "start": 531.12, "text": "aspects"}, {"end": 531.72, "start": 531.72, "text": "like"}, {"end": 532.48, "start": 531.72, "text": "precision"}, {"end": 533.44, "start": 532.48, "text": "and"}, {"end": 534.44, "start": 533.44, "text": "recall"}, {"end": 534.64, "start": 534.44, "text": "and"}, {"end": 535.0, "start": 534.64, "text": "so"}, {"end": 535.0, "start": 535.0, "text": "we"}, {"end": 535.0, "start": 535.0, "text": "need"}, {"end": 535.0, "start": 535.0, "text": "to"}, {"end": 535.0, "start": 535.0, "text": "know"}, {"end": 535.0, "start": 535.0, "text": "for"}, {"end": 535.44, "start": 535.0, "text": "this"}, {"end": 535.76, "start": 535.44, "text": "application"}, {"end": 535.84, "start": 535.76, "text": "is"}, {"end": 536.16, "start": 535.84, "text": "high"}, {"end": 536.68, "start": 536.16, "text": "recall"}, {"end": 537.16, "start": 536.68, "text": "more"}, {"end": 537.24, "start": 537.16, "text": "important"}, {"end": 537.44, "start": 537.24, "text": "or"}, {"end": 537.84, "start": 537.44, "text": "high"}, {"end": 538.56, "start": 537.84, "text": "precision"}, {"end": 538.56, "start": 538.56, "text": "is"}, {"end": 538.92, "start": 538.56, "text": "more"}, {"end": 539.96, "start": 538.92, "text": "important."}], "text": " So then we would 
need to consider the difference and design measures appropriately. We generally need to consider how will the results be further processed by a user and then think from a user's perspective what quality is important, what aspect of quality is important. Sometimes there are trade-offs between multiple aspects like precision and recall and so we need to know for this application is high recall more important or high precision is more important."}, {"chunks": [{"end": 540.12, "start": 540.0, "text": "Really"}, {"end": 540.36, "start": 540.12, "text": "we"}, {"end": 540.8, "start": 540.36, "text": "associate"}, {"end": 540.96, "start": 540.8, "text": "the"}, {"end": 541.32, "start": 540.96, "text": "different"}, {"end": 541.68, "start": 541.32, "text": "cost"}, {"end": 542.04, "start": 541.68, "text": "with"}, {"end": 542.08, "start": 542.04, "text": "each"}, {"end": 542.44, "start": 542.08, "text": "different"}, {"end": 542.92, "start": 542.44, "text": "decision"}, {"end": 543.36, "start": 542.92, "text": "error."}, {"end": 543.6, "start": 543.36, "text": "And"}, {"end": 544.04, "start": 543.6, "text": "this"}, {"end": 544.12, "start": 544.04, "text": "of"}, {"end": 544.64, "start": 544.12, "text": "course"}, {"end": 544.8, "start": 544.64, "text": "has"}, {"end": 544.8, "start": 544.8, "text": "to"}, {"end": 544.84, "start": 544.8, "text": "be"}, {"end": 545.32, "start": 544.84, "text": "designed"}, {"end": 545.56, "start": 545.32, "text": "in"}, {"end": 545.6, "start": 545.56, "text": "an"}, {"end": 546.0, "start": 545.6, "text": "application"}, {"end": 546.56, "start": 546.0, "text": "specific"}, {"end": 547.12, "start": 546.56, "text": "way."}, {"end": 547.36, "start": 547.12, "text": "Some"}, {"end": 547.84, "start": 547.36, "text": "commonly"}, {"end": 548.6, "start": 547.84, "text": "used"}, {"end": 549.32, "start": 548.6, "text": "measures"}, {"end": 549.96, "start": 549.32, "text": "for"}, {"end": 550.44, "start": 549.96, "text": "relative"}, {"end": 551.0, 
"start": 550.44, "text": "comparison"}, {"end": 551.04, "start": 551.0, "text": "of"}, {"end": 551.4, "start": 551.04, "text": "different"}, {"end": 552.08, "start": 551.4, "text": "methods"}, {"end": 552.52, "start": 552.08, "text": "are"}, {"end": 552.84, "start": 552.52, "text": "the"}, {"end": 552.84, "start": 552.84, "text": "following."}, {"end": 553.48, "start": 552.84, "text": "Classification"}, {"end": 554.16, "start": 553.48, "text": "accuracy"}, {"end": 554.48, "start": 554.16, "text": "is"}, {"end": 554.72, "start": 554.48, "text": "very"}, {"end": 555.08, "start": 554.72, "text": "commonly"}, {"end": 555.76, "start": 555.08, "text": "used"}, {"end": 556.0, "start": 555.76, "text": "for"}, {"end": 556.68, "start": 556.0, "text": "especially"}, {"end": 557.48, "start": 556.68, "text": "balanced"}, {"end": 558.12, "start": 557.48, "text": "test"}, {"end": 558.72, "start": 558.12, "text": "set."}, {"end": 559.52, "start": 558.72, "text": "Precision"}, {"end": 559.96, "start": 559.52, "text": "recall"}, {"end": 560.08, "start": 559.96, "text": "and"}, {"end": 560.6, "start": 560.08, "text": "F"}, {"end": 561.6, "start": 560.6, "text": "scores"}, {"end": 562.04, "start": 561.6, "text": "are"}, {"end": 562.28, "start": 562.04, "text": "commonly"}, {"end": 562.72, "start": 562.28, "text": "reported"}, {"end": 562.88, "start": 562.72, "text": "to"}, {"end": 563.64, "start": 562.88, "text": "characterize"}, {"end": 563.64, "start": 563.64, "text": "the"}, {"end": 564.76, "start": 563.64, "text": "performances"}, {"end": 564.92, "start": 564.76, "text": "in"}, {"end": 565.44, "start": 564.92, "text": "different"}, {"end": 565.96, "start": 565.44, "text": "angles."}, {"end": 566.12, "start": 565.96, "text": "And"}, {"end": 566.24, "start": 566.12, "text": "there"}, {"end": 566.32, "start": 566.24, "text": "are"}, {"end": 566.8, "start": 566.32, "text": "some"}, {"end": 567.48, "start": 566.8, "text": "also"}, {"end": 568.4, "start": 567.48, "text": "variations"}, 
{"end": 568.84, "start": 568.4, "text": "like"}, {"end": 568.96, "start": 568.84, "text": "per"}, {"end": 569.44, "start": 568.96, "text": "document"}, {"end": 569.96, "start": 569.44, "text": "basically"}], "text": " Really we associate the different cost with each different decision error. And this of course has to be designed in an application specific way. Some commonly used measures for relative comparison of different methods are the following. Classification accuracy is very commonly used for especially balanced test set. Precision recall and F scores are commonly reported to characterize the performances in different angles. And there are some also variations like per document basically"}, {"chunks": [{"end": 570.24, "start": 570.0, "text": "per"}, {"end": 570.96, "start": 570.24, "text": "category"}, {"end": 571.52, "start": 570.96, "text": "averaging"}, {"end": 571.92, "start": 571.52, "text": "and"}, {"end": 572.16, "start": 571.92, "text": "then"}, {"end": 572.32, "start": 572.16, "text": "take"}, {"end": 572.68, "start": 572.32, "text": "an"}, {"end": 573.12, "start": 572.68, "text": "average"}, {"end": 573.36, "start": 573.12, "text": "of"}, {"end": 573.88, "start": 573.36, "text": "all"}, {"end": 574.16, "start": 573.88, "text": "of"}, {"end": 574.16, "start": 574.16, "text": "them"}, {"end": 574.2, "start": 574.16, "text": "in"}, {"end": 574.56, "start": 574.2, "text": "different"}, {"end": 575.24, "start": 574.56, "text": "ways"}, {"end": 575.64, "start": 575.24, "text": "micro"}, {"end": 576.24, "start": 575.64, "text": "versus"}, {"end": 576.72, "start": 576.24, "text": "macro"}, {"end": 577.36, "start": 576.72, "text": "averaging."}, {"end": 577.88, "start": 577.36, "text": "In"}, {"end": 578.56, "start": 577.88, "text": "general,"}, {"end": 578.8, "start": 578.56, "text": "you"}, {"end": 579.16, "start": 578.8, "text": "want"}, {"end": 579.24, "start": 579.16, "text": "to"}, {"end": 579.76, "start": 579.24, "text": "look"}, {"end": 580.04, 
"start": 579.76, "text": "at"}, {"end": 580.56, "start": 580.04, "text": "the"}, {"end": 580.96, "start": 580.56, "text": "results"}, {"end": 581.12, "start": 580.96, "text": "from"}, {"end": 581.52, "start": 581.12, "text": "multiple"}, {"end": 582.6, "start": 581.52, "text": "perspectives"}, {"end": 582.64, "start": 582.6, "text": "and"}, {"end": 582.96, "start": 582.64, "text": "for"}, {"end": 583.36, "start": 582.96, "text": "particular"}, {"end": 583.72, "start": 583.36, "text": "applications,"}, {"end": 583.96, "start": 583.72, "text": "some"}, {"end": 584.6, "start": 583.96, "text": "perspectives"}, {"end": 584.72, "start": 584.6, "text": "would"}, {"end": 584.84, "start": 584.72, "text": "be"}, {"end": 585.12, "start": 584.84, "text": "more"}, {"end": 585.16, "start": 585.12, "text": "important"}, {"end": 585.6, "start": 585.16, "text": "than"}, {"end": 586.36, "start": 585.6, "text": "others."}, {"end": 586.48, "start": 586.36, "text": "But"}, {"end": 587.0, "start": 586.48, "text": "for"}, {"end": 587.76, "start": 587.0, "text": "diagnosis"}, {"end": 588.4, "start": 587.76, "text": "analysis"}, {"end": 588.52, "start": 588.4, "text": "of"}, {"end": 589.12, "start": 588.52, "text": "categorization"}, {"end": 589.96, "start": 589.12, "text": "methods,"}, {"end": 590.4, "start": 589.96, "text": "it's"}, {"end": 590.76, "start": 590.4, "text": "generally"}, {"end": 591.28, "start": 590.76, "text": "useful"}, {"end": 591.72, "start": 591.28, "text": "to"}, {"end": 592.2, "start": 591.72, "text": "look"}, {"end": 592.28, "start": 592.2, "text": "at"}, {"end": 592.48, "start": 592.28, "text": "as"}, {"end": 592.92, "start": 592.48, "text": "many"}, {"end": 593.48, "start": 592.92, "text": "perspectives"}, {"end": 593.64, "start": 593.48, "text": "as"}, {"end": 594.4, "start": 593.64, "text": "possible"}, {"end": 595.0, "start": 594.4, "text": "to"}, {"end": 595.56, "start": 595.0, "text": "see"}, {"end": 596.04, "start": 595.56, "text": "subtle"}, {"end": 
596.68, "start": 596.04, "text": "differences"}, {"end": 596.92, "start": 596.68, "text": "between"}, {"end": 597.44, "start": 596.92, "text": "methods"}, {"end": 597.6, "start": 597.44, "text": "or"}, {"end": 597.96, "start": 597.6, "text": "to"}, {"end": 598.04, "start": 597.96, "text": "see"}, {"end": 598.56, "start": 598.04, "text": "where"}, {"end": 598.6, "start": 598.56, "text": "a"}, {"end": 598.96, "start": 598.6, "text": "method"}, {"end": 599.56, "start": 598.96, "text": "might"}, {"end": 599.6, "start": 599.56, "text": "be"}, {"end": 599.96, "start": 599.6, "text": "weak."}], "text": " per category averaging and then take an average of all of them in different ways micro versus macro averaging. In general, you want to look at the results from multiple perspectives and for particular applications, some perspectives would be more important than others. But for diagnosis analysis of categorization methods, it's generally useful to look at as many perspectives as possible to see subtle differences between methods or to see where a method might be weak."}, {"chunks": [{"end": 600.64, "start": 600.0, "text": "from"}, {"end": 601.04, "start": 600.64, "text": "which"}, {"end": 601.2, "start": 601.04, "text": "you"}, {"end": 601.32, "start": 601.2, "text": "can"}, {"end": 601.64, "start": 601.32, "text": "obtain"}, {"end": 602.04, "start": 601.64, "text": "insights"}, {"end": 602.28, "start": 602.04, "text": "for"}, {"end": 602.96, "start": 602.28, "text": "improving"}, {"end": 603.52, "start": 602.96, "text": "the"}, {"end": 604.6, "start": 603.52, "text": "method."}, {"end": 604.92, "start": 604.6, "text": "Finally,"}, {"end": 605.6, "start": 604.92, "text": "sometimes"}, {"end": 606.0, "start": 605.6, "text": "ranking"}, {"end": 606.16, "start": 606.0, "text": "may"}, {"end": 606.28, "start": 606.16, "text": "be"}, {"end": 606.68, "start": 606.28, "text": "more"}, {"end": 607.12, "start": 606.68, "text": "appropriate,"}, {"end": 607.24, "start": 607.12, 
"text": "so"}, {"end": 607.36, "start": 607.24, "text": "be"}, {"end": 607.8, "start": 607.36, "text": "careful."}, {"end": 608.56, "start": 607.8, "text": "Sometimes"}, {"end": 609.16, "start": 608.56, "text": "categorization"}, {"end": 609.64, "start": 609.16, "text": "tasks"}, {"end": 609.92, "start": 609.64, "text": "may"}, {"end": 610.0, "start": 609.92, "text": "be"}, {"end": 610.64, "start": 610.0, "text": "better"}, {"end": 611.2, "start": 610.64, "text": "framed"}, {"end": 611.72, "start": 611.2, "text": "as"}, {"end": 611.8, "start": 611.72, "text": "a"}, {"end": 611.8, "start": 611.8, "text": "ranking"}, {"end": 612.04, "start": 611.8, "text": "task."}, {"end": 612.08, "start": 612.04, "text": "And"}, {"end": 612.36, "start": 612.08, "text": "there"}, {"end": 612.4, "start": 612.36, "text": "are"}, {"end": 612.72, "start": 612.4, "text": "machine"}, {"end": 612.96, "start": 612.72, "text": "learning"}, {"end": 613.84, "start": 612.96, "text": "methods"}, {"end": 614.32, "start": 613.84, "text": "for"}, {"end": 614.88, "start": 614.32, "text": "optimizing"}, {"end": 615.52, "start": 614.88, "text": "ranking"}, {"end": 616.2, "start": 615.52, "text": "measures"}, {"end": 616.6, "start": 616.2, "text": "as"}, {"end": 616.88, "start": 616.6, "text": "well."}, {"end": 617.04, "start": 616.88, "text": "So"}, {"end": 617.16, "start": 617.04, "text": "here"}, {"end": 617.2, "start": 617.16, "text": "are"}, {"end": 617.4, "start": 617.2, "text": "two"}, {"end": 619.0, "start": 617.4, "text": "suggested"}, {"end": 619.72, "start": 619.0, "text": "readings."}, {"end": 620.16, "start": 619.72, "text": "One"}, {"end": 620.84, "start": 620.16, "text": "is"}, {"end": 621.0, "start": 620.84, "text": "some"}, {"end": 621.68, "start": 621.0, "text": "chapters"}, {"end": 621.84, "start": 621.68, "text": "of"}, {"end": 622.04, "start": 621.84, "text": "this"}, {"end": 622.32, "start": 622.04, "text": "book"}, {"end": 622.32, "start": 622.32, "text": "where"}, {"end": 
622.32, "start": 622.32, "text": "you"}, {"end": 623.16, "start": 622.32, "text": "can"}, {"end": 623.4, "start": 623.16, "text": "find"}, {"end": 623.56, "start": 623.4, "text": "more"}, {"end": 624.2, "start": 623.56, "text": "discussion"}, {"end": 625.04, "start": 624.2, "text": "about"}, {"end": 625.76, "start": 625.04, "text": "evaluation"}, {"end": 626.24, "start": 625.76, "text": "measures."}, {"end": 626.36, "start": 626.24, "text": "The"}, {"end": 626.96, "start": 626.36, "text": "second"}, {"end": 627.56, "start": 626.96, "text": "is"}, {"end": 628.12, "start": 627.56, "text": "a"}, {"end": 628.68, "start": 628.12, "text": "paper"}, {"end": 629.04, "start": 628.68, "text": "about"}, {"end": 629.12, "start": 629.04, "text": "the"}, {"end": 629.96, "start": 629.12, "text": "comparison"}], "text": " from which you can obtain insights for improving the method. Finally, sometimes ranking may be more appropriate, so be careful. Sometimes categorization tasks may be better framed as a ranking task. And there are machine learning methods for optimizing ranking measures as well. So here are two suggested readings. One is some chapters of this book where you can find more discussion about evaluation measures. 
The second is a paper about the comparison"}, {"chunks": [{"end": 630.28, "start": 630.0, "text": "of"}, {"end": 630.96, "start": 630.28, "text": "different"}, {"end": 631.56, "start": 630.96, "text": "approaches"}, {"end": 631.56, "start": 631.56, "text": "to"}, {"end": 631.92, "start": 631.56, "text": "text"}, {"end": 632.64, "start": 631.92, "text": "categorization"}, {"end": 632.8, "start": 632.64, "text": "and"}, {"end": 633.08, "start": 632.8, "text": "it"}, {"end": 633.76, "start": 633.08, "text": "also"}, {"end": 634.04, "start": 633.76, "text": "has"}, {"end": 634.64, "start": 634.04, "text": "excellent"}, {"end": 635.16, "start": 634.64, "text": "discussion"}, {"end": 635.28, "start": 635.16, "text": "of"}, {"end": 635.64, "start": 635.28, "text": "how"}, {"end": 635.64, "start": 635.64, "text": "to"}, {"end": 636.52, "start": 635.64, "text": "evaluate"}, {"end": 637.04, "start": 636.52, "text": "the"}, {"end": 637.8, "start": 637.04, "text": "text"}, {"end": 651.88, "start": 637.8, "text": "categorization."}], "text": " of different approaches to text categorization and it also has excellent discussion of how to evaluate the text categorization."}]}}