hi...
currently, i am trying to figure out how to accurately predict the relevancy of an unknown document using the "Bayesian categorization model based on document text" component.
i have created a test protocol to try to understand how the Bayesian categorization model works and how accurate it is.
i have attached the test protocol (Forum-Test_Title.xml) here.
in this protocol, the good list comprises entries related to "electric vehicle" while the bad list comprises entries related to "HIV" (the list is "patent list.xml").
a model is then created by analysing the titles of the entries in the list.
i was wondering, after creating the model and re-applying it to the source list, why do i get entries of the good list marked bad and entries of the bad list marked good?
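to make the question concrete, here is a minimal sketch of how i understand a Bayesian text model to work. it is a generic multinomial naive Bayes with add-one smoothing written in Python, not the component's actual implementation (which i do not know), and the titles and the function names `train` and `classify` are made up for illustration; they are not taken from "patent list.xml".

```python
import math
from collections import Counter

def train(good_titles, bad_titles):
    """Count word frequencies per class (generic multinomial naive Bayes)."""
    counts = {"good": Counter(), "bad": Counter()}
    for label, titles in (("good", good_titles), ("bad", bad_titles)):
        for title in titles:
            counts[label].update(title.lower().split())
    vocab = set(counts["good"]) | set(counts["bad"])
    priors = {"good": len(good_titles), "bad": len(bad_titles)}
    return counts, vocab, priors

def classify(title, model):
    """Return the label with the higher log-probability for this title."""
    counts, vocab, priors = model
    total_docs = sum(priors.values())
    scores = {}
    for label in ("good", "bad"):
        total_words = sum(counts[label].values())
        score = math.log(priors[label] / total_docs)
        for word in title.lower().split():
            # add-one (Laplace) smoothing so unseen words do not give log(0)
            score += math.log((counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# hypothetical titles, invented for this example only
good = [
    "battery charging system for electric vehicle",
    "electric vehicle motor control",
    "method for detection of short circuit",
]
bad = [
    "method for detection of hiv infection",
    "method for treatment of hiv",
    "composition and method for detection of hiv antibodies",
]

model = train(good, bad)
for title in good + bad:
    print(classify(title, model), "<-", title)
```

in this toy case the third good title comes back as bad, even though the model was trained on it, because "method", "for", "detection" and "of" occur more often in the bad titles than in the good ones. so i can see how a model that only weighs word frequencies might not reproduce its own training labels, but i would like to know whether this is what is happening in my protocol.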
thanks.