Why can't I get the same result with summary(hardham.res)? #17

Open
j1smile opened this issue Oct 28, 2013 · 0 comments


j1smile commented Oct 28, 2013

When I rerun the code, this is what I get:
...

    hardham.res <- ifelse(hardham.spamtest > hardham.hamtest,
                          TRUE,
                          FALSE)
    summary(hardham.res)

...
The result is:

       Mode   FALSE    TRUE    NA's
    logical     243       6       0

I also tried:

    hardham.res <- ifelse(hardham.spamtest == hardham.hamtest,
                          TRUE,
                          FALSE)

The result is:

       Mode   FALSE    TRUE    NA's
    logical      21     228       0

That means the spam and ham scores are exactly equal for most of the messages (228 of 249).
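
One possible explanation for ties like these is that the product form `prior * prod(match.probs) * c ^ n` underflows to zero under both models. Here is a minimal sketch of that effect. The counts `n.spam` and `n.ham` are made up for illustration and are not taken from the hard ham data.

```r
# Sketch: with c = 1e-6, the factor c ^ n falls below the smallest
# positive double once n reaches a few dozen, and is stored as 0.
prior <- 0.5
c <- 1e-6

# Hypothetical counts of unmatched terms for one message under two models
n.spam <- 60
n.ham  <- 90

score.spam <- prior * c ^ n.spam
score.ham  <- prior * c ^ n.ham

print(score.spam)                # 0
print(score.ham)                 # 0
print(score.spam == score.ham)   # TRUE: both underflowed, so they tie

# The same quantities in log space stay finite and distinct
log.spam <- log10(prior) + n.spam * log10(c)
log.ham  <- log10(prior) + n.ham  * log10(c)
print(log.spam > log.ham)        # TRUE
```

If the real scores behave this way, a `>` comparison would report FALSE for every message whose two scores are both zero, whatever the underlying evidence is.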

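So I suspect this is a floating-point underflow problem: the product of many small probabilities rounds to zero. To test that, I changed the classify.email function to work with log probabilities, as below: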
    classify.email <- function(path, training.df, prior = 0.5, c = 1e-6)
    {
      # Here, we use many of the support functions to get the
      # email text data in a workable format
      msg <- get.msg(path)
      msg.tdm <- get.tdm(msg)
      msg.freq <- rowSums(as.matrix(msg.tdm))

      # Find intersections of words
      msg.match <- intersect(names(msg.freq), training.df$term)

      # Now, we just perform the naive Bayes calculation
      if(length(msg.match) < 1)
      {
        # was: return(prior * c ^ (length(msg.freq)))
        return(log10(prior) + length(msg.freq) * log10(c))
      }
      else
      {
        match.probs <- training.df$occurrence[match(msg.match, training.df$term)]
        # was: return(prior * prod(match.probs) * c ^ (length(msg.freq) - length(msg.match)))
        return(log10(prior) + sum(log10(match.probs)) +
               (length(msg.freq) - length(msg.match)) * log10(c))
      }
    }
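
As a sanity check on the rewrite, the log form should equal `log10` of the original product form whenever the product does not underflow. The sketch below checks this on small inputs. `match.probs` and `n.unmatched` are hypothetical values chosen only so that the product stays representable.

```r
# Hypothetical inputs, small enough that the product form does not underflow
prior <- 0.5
c <- 1e-6
match.probs <- c(0.2, 0.05, 0.01)   # made-up term probabilities
n.unmatched <- 4                    # made-up count of unmatched terms

# Original product form and the log-space rewrite
product.form <- prior * prod(match.probs) * c ^ n.unmatched
log.form <- log10(prior) + sum(log10(match.probs)) + n.unmatched * log10(c)

# The two should agree up to floating-point rounding
print(all.equal(log10(product.form), log.form))
```

If the two forms agree on inputs like these, any change in the classification results would come from messages where the product form had underflowed, not from an algebra error in the rewrite.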

This time I get:

    hardham.res <- ifelse(hardham.spamtest > hardham.hamtest,
    +                     TRUE,
    +                     FALSE)
    summary(hardham.res)
       Mode   FALSE    TRUE    NA's
    logical      80     169       0

Now the conclusion is just plain wrong: 169 of the 249 hard ham messages score higher as spam than as ham.
Has anyone else encountered the same problem?
Where did I make a mistake?
