nlp4twitter
Academic year 2015/2016
- Type
- Elective (student's choice)
- Delivery
- Traditional
- Language
- English
- Prerequisites
- 1. Linux, bash, Markdown and Python (I could give a short tutorial on these as well, but c'mon :)
2. Some knowledge of probability, but the course will be self-contained. Many of the
participants have perhaps already read Jurafsky and Martin; that should be enough.
Course summary
Course objectives
In this tutorial, we will analyze some of the most recent natural language processing techniques for
extracting detailed information from tweets.
Learning assessment methods
1. A project, 80% of the final grade (preferably tackling some problem in Italian)
2. A presentation of an assigned paper, 20% of the final grade
My Italian is nonexistent, and there is always much more work on English, so we will try to
make this course a good contribution to Italian NLP.
Program
We will cover several topics in this course. Some of them include:
1. POS tagging
2. Sentiment analysis/classification
3. Polarity
4. Named-entity recognition
5. Event detection
6. Topic identification/modelling
7. Latent semantic analysis
8. Latent Dirichlet allocation
9. Language modelling
10. "Interestingness"
11. ...
to name but a few of the topics we might cover. To these, we have to add the "high performance" part:
1. Frameworks: Hadoop, Spark
2. Parallel programming: parallel (bash), pyparallel (Python)
This is simply a list of topics; I'd be very interested in hearing what students might be interested in as
well. Shoot me an email!
Suggested readings and bibliography
http://leoferres.github.io/nlp.html
Note
The first class will be a tutorial on using the following tools:
1. Jupyter notebooks
2. NLTK & spaCy
3. scikit-learn
4. Matplotlib
and we will probably use some other tools in the Python stack.