Examining Variations of Prominent Features in Genre Classification



Table 5. Confusion matrix: Image NB on Dataset II.

classified as --->   AM   BF   BR    M    P    T
AM                   10    4    4   20   25   36
BF                    1    2    0    4    7   15
BR                   17    9    5   16   32   21
M                     6    0    1   81    0   11
P                     5    1    3    8   48    2
T                     1    2    0    5    1   91

Table 6. Confusion matrix: Style RF on Dataset II.

classified as --->   AM   BF   BR    M    P    T
AM                   74    1    8    3    4    9
BF                    0   24    0    0    2    0
BR                    7    0   85    3    4    1
M                     4    0    1   94    0    0
P                     6    0    7    3   48    3
T                     9    1    2    0    4   84

Table 7. Confusion matrix: Rainbow SVM on Dataset II.

classified as --->   AM   BF   BR    M    P    T
AM                   41    1    8    3   18   28
BF                    0   23    1    0    4    1
BR                    6    0   61    3   28    2
M                     3    1    2   87    4    3
P                     3    0    5    3   53    3
T                     2    0    1    0    8   89

Although the experimental results suggest that style RF is
the best overall performer on the two datasets, they do not
identify, for any classifier, genre classes on which that
classifier consistently outperforms the other two.
On closer examination, however, the results show that the
binary partition of the genre classes into the three
best-performing and the three worst-performing classes is
preserved across the experiments on the two datasets. These
partitions are (Minutes, Periodicals, Thesis) and
(Academic Monograph, Book of Fiction, Business Report) for
image NB, and (Book of Fiction, Minutes, Thesis) and
(Academic Monograph, Business Report, Periodicals) for
style RF and Rainbow SVM.
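
The partition described above can be illustrated with a minimal sketch. The text here does not state which per-class measure the ranking is based on, so the sketch assumes the per-class F1 score (the harmonic mean of precision and recall), which does not depend on whether rows or columns hold the true class. The function names per_class_f1 and best_worst_partition are hypothetical, and the counts are those of Table 6 (style RF on Dataset II).

```python
# Counts copied from Table 6 (style RF, Dataset II).
# Assumption: rows are the actual genre, columns the genre "classified as".
LABELS = ["AM", "BF", "BR", "M", "P", "T"]
TABLE_6 = [
    [74, 1, 8, 3, 4, 9],
    [0, 24, 0, 0, 2, 0],
    [7, 0, 85, 3, 4, 1],
    [4, 0, 1, 94, 0, 0],
    [6, 0, 7, 3, 48, 3],
    [9, 1, 2, 0, 4, 84],
]


def per_class_f1(matrix, labels):
    """Return {label: F1}, using F1 = 2*TP / (row total + column total)."""
    scores = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        row_total = sum(matrix[i])                  # all documents in row i
        col_total = sum(row[i] for row in matrix)   # all documents in column i
        denom = row_total + col_total
        scores[label] = 2 * true_positives / denom if denom else 0.0
    return scores


def best_worst_partition(scores, k=3):
    """Split the labels into the k highest-scoring and the remaining ones."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sorted(ranked[:k]), sorted(ranked[k:])


scores = per_class_f1(TABLE_6, LABELS)
best, worst = best_worst_partition(scores)
print("best three: ", best)
print("worst three:", worst)
```

Under this assumed measure, the Table 6 counts place BF, M and T in the upper group and AM, BR and P in the lower group, which agrees with the partition reported for style RF. Other measures, such as recall alone, may rank the classes differently.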

The generally low performance of the image features is
partly due to the crude image representation. In the
current model, the image features capture only the first
page of the document, and each pixel value is strongly
anchored to its position. This representation could be
improved by combining representations of several pages of
the document and by softening the positional information so
that it embodies the general shape or topology of the
image. Likewise, for style, the size of the dataset and
the variety of the documents used for training and for
compiling word lists should be examined further for
refinement.
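
One possible reading of the suggested image refinement is sketched below. It is not the representation used in the experiments: it assumes each page is available as a 2-D grid of grey-level intensities of identical size, softens position by averaging pixels over small tiles, and combines pages by averaging the pooled grids. The function names block_average and multi_page_features and the block parameter are hypothetical.

```python
def block_average(page, block):
    """Average a 2-D grid of pixel intensities over block x block tiles.

    Each output cell summarises a neighbourhood, so a small shift in the
    layout changes the feature vector less than it would per pixel.
    """
    rows, cols = len(page), len(page[0])
    pooled = []
    for r in range(0, rows, block):
        pooled_row = []
        for c in range(0, cols, block):
            tile = [
                page[i][j]
                for i in range(r, min(r + block, rows))
                for j in range(c, min(c + block, cols))
            ]
            pooled_row.append(sum(tile) / len(tile))
        pooled.append(pooled_row)
    return pooled


def multi_page_features(pages, block=2):
    """Pool every page, then average the pooled grids cell by cell.

    Assumption: all pages share the same dimensions.
    """
    pooled_pages = [block_average(page, block) for page in pages]
    n_pages = len(pooled_pages)
    n_rows, n_cols = len(pooled_pages[0]), len(pooled_pages[0][0])
    return [
        [sum(p[r][c] for p in pooled_pages) / n_pages for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two toy 2x4 "pages" stand in for real scanned pages.
pages = [
    [[0, 0, 4, 4], [0, 0, 4, 4]],
    [[2, 2, 0, 0], [2, 2, 0, 0]],
]
features = multi_page_features(pages, block=2)
print(features)
```

Averaging across pages is only the simplest way to combine them. Concatenating the pooled grids of the first few pages would preserve page order at the cost of a longer feature vector.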


