Wednesday, January 19, 2022

All of Shakespeare

 Ploughing through on CS50 AI.






So how does a computer apprehend the meaning of a text? Short
answer: it doesn't really. One instructs the computer to
tokenize the text in a certain way (that is, to work with words or other units)
and then trains a model on those tokens. Below, Markovify is given the entire
corpus of Shakespeare's works and asked to produce 5 sentences from a
Markov chain model. The result is kinda silly, but each word proposed does follow
the previous two somewhere in Shakespeare...
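
For the curious, here is a rough sketch of the idea in plain Python. It is not Markovify's own code, just a minimal two-word Markov chain written from scratch. The names build_chain, generate and SAMPLE are made up for this illustration, and the sample is a tiny toy text standing in for the full corpus.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=12, seed=None):
    """Walk the chain: pick a start state, then keep sampling a follower word."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a given seed is repeatable
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: this pair only occurs at the very end of the text
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return output


# Toy stand-in text (an assumption for illustration, not the real corpus).
SAMPLE = (
    "to be or not to be that is the question "
    "to be or to sleep perchance to dream "
    "not to be moved is the question of the hour"
)

chain = build_chain(SAMPLE)
print(" ".join(generate(chain, length=12, seed=1)))
```

Every word after the first two is drawn from the words that followed the same pair somewhere in the input, which is why the output sounds locally plausible and globally daft.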






                                                                *     *     *

A more mundane problem, which a computer can handle, is telling whether
an email is genuine or spam, or whether a product review is favourable or
not.

One gives the computer a set of reviews to train on, and it builds a model
based on word frequency. The computer is told which reviews are favourable and
which are not. It is then asked to classify a new review.




A naive Bayesian analysis, multiplying the probabilities of the major words, gives
"My grandson loved it" a 68% chance of being favourable. Good work, nltk!!

Natural Language Toolkit
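
To show what "multiplying the probabilities" means, here is a small from-scratch sketch in plain Python. It does not use nltk and is not the course's code. The training reviews are invented for illustration, and train, classify and TRAINING are hypothetical names. It also assumes add-one smoothing so that an unseen word does not zero out a whole class.

```python
from collections import Counter

# Hypothetical training reviews, invented for this illustration.
TRAINING = [
    ("my grandson loved it", "positive"),
    ("great fun for the whole family", "positive"),
    ("loved the quality", "positive"),
    ("broke after one day", "negative"),
    ("my grandson hated it", "negative"),
    ("poor quality and no fun", "negative"),
]


def train(samples):
    """Count, per label, how many reviews contain each word."""
    label_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in set(text.lower().split()):
            counts[word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(text, model):
    """Return a normalised score per label for the given review text."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    words = set(text.lower().split()) & vocabulary  # ignore words never seen in training
    scores = {}
    for label, n in label_counts.items():
        score = n / total  # prior: share of training reviews with this label
        for word in words:
            # P(word appears | label), with add-one smoothing
            score *= (word_counts[label][word] + 1) / (n + 2)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


model = train(TRAINING)
print(classify("my grandson loved it", model))
```

On this toy data the positive label comes out ahead. The exact figure depends entirely on the training set, so it will not match the 68% above.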










