Source code of Microblog Semantic Topic Identification prototype is published

Our previous study extracts human-readable topics from a set of microblog posts. Building on the idea of identifying the topics of a crowd of microblog users, we have recently come up with an approach that semantically represents microblog topics for machine consumption. The source code of the prototype is published. To install the topic identification approach on a Linux machine, follow the steps below.

Install R
Make sure that Rscript runs
Install php-cli (the PHP command-line interface), version > 5
Make sure that php-curl is installed
Make sure that shell_exec works in php-cli
Obtain a TagMe API key
Download the SBounTI package and extract it in an empty directory
Edit cfg/config.php as needed (for example, the base URLs of the resources that will be produced and the TagMe API key)
Obtain a microblog post dataset of about 5 thousand posts, either
- as a text file with one short text per line
- or as a raw file retrieved from the Twitter streaming API
Issue the command:
- ./sbounti <filename> "<dataset_name>" "<start_date>" "<end_date>"
  for the text file
- ./sbounti <filename> "<dataset_name>"
  for the raw Twitter streaming API file
Here <filename> is the name of the file that contains the short messages, <dataset_name> is the name used in the explanations of the resources expressed in OWL, and <start_date> and <end_date> are valid start and end date-times of the post set, in the format of this example: Wed Sep 21 11:01:56 +0300 2016.
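If you keep the start and end times of your post set as date objects, the date format above can be produced programmatically. The following is a minimal Python sketch, not part of the SBounTI package; the function names are hypothetical, and the format string is inferred only from the single example given above.

```python
from datetime import datetime, timedelta, timezone

# Format inferred from the example "Wed Sep 21 11:01:56 +0300 2016".
# %a and %b give English names as long as the program does not change the locale.
DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def format_post_time(moment: datetime) -> str:
    """Render a timezone-aware datetime in the date format shown above."""
    if moment.tzinfo is None:
        # Without a UTC offset, %z would render as an empty string.
        raise ValueError("moment must carry a UTC offset")
    return moment.strftime(DATE_FORMAT)


def parse_post_time(text: str) -> datetime:
    """Parse a string in the same format back into a datetime."""
    return datetime.strptime(text, DATE_FORMAT)


# Usage: a post set that starts at 11:01:56 local time, UTC+3.
start = datetime(2016, 9, 21, 11, 1, 56, tzinfo=timezone(timedelta(hours=3)))
print(format_post_time(start))  # Wed Sep 21 11:01:56 +0300 2016
```

Parsing a date string with parse_post_time before passing it on is a cheap way to catch a malformed value early.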
The contents of the produced OWL file are written to STDOUT, so you may want to redirect the output to a file by adding "> filename.owl" at the end of the command.
If you have questions, please contact Ahmet Yildirim.
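If you prefer to drive the run from a script, the prerequisite check, the two command forms and the output redirection can be sketched in Python as below. This is only an illustration under assumptions: the helper names and the file names posts.txt and topics.owl are hypothetical, the executables are assumed to be on the PATH as Rscript and php, and the script is assumed to be started from the directory where the package was extracted.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def missing_tools(tools: Sequence[str] = ("Rscript", "php")) -> List[str]:
    """Return the tools from the list that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def build_command(input_file: str, dataset_name: str,
                  start_date: Optional[str] = None,
                  end_date: Optional[str] = None,
                  executable: str = "./sbounti") -> List[str]:
    """Build the argument list for either of the two command forms.

    With both dates: the text file form. With no dates: the raw
    Twitter streaming API file form.
    """
    command = [executable, input_file, dataset_name]
    if start_date is None and end_date is None:
        return command
    if start_date is None or end_date is None:
        raise ValueError("give both start_date and end_date, or neither")
    return command + [start_date, end_date]


def run_to_file(command: List[str], output_path: str) -> None:
    """Run the command and write its STDOUT to output_path,
    which has the same effect as "> filename.owl" in the shell."""
    with open(output_path, "w", encoding="utf-8") as out:
        subprocess.run(command, stdout=out, check=True)


# Usage (hypothetical file names):
# if not missing_tools():
#     cmd = build_command("posts.txt", "my dataset",
#                         "Wed Sep 21 11:01:56 +0300 2016",
#                         "Wed Sep 21 12:01:56 +0300 2016")
#     run_to_file(cmd, "topics.owl")
```

Passing the arguments as a list avoids shell quoting issues with the spaces in the dataset name and the dates.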

Ahmet Yıldırım
