🗳️+👀 A platform to protect elections in a disinformation world.
A platform for monitoring democratic elections and fighting online disinformation.
For a full description, please check the work on which Election Watch is based: MSc Thesis, Twitter Watch's paper.
Since Election Watch can be deployed in many contexts, this section lists each deployment and the exact code version it used, in reverse chronological order:
Event | From | To | Code | Dataset | Dataset contents | Archived endpoint |
---|---|---|---|---|---|---|
Portuguese Presidential Elections, Jan 24th 2021 | Sep 2nd 2020 | Jan 30th 2021 | v1.0 | Dataset | Twitter(tweets=57 155 221, users=1 115 491) | election-watch-portugal-presidentials-2021 |
If you use any of these datasets in your work, please cite the thesis they are based on. Here is the BibTeX entry:
@masterthesis{ramalho2021highlevel,
title={High-level Approaches to Detect Malicious Political Activity on Twitter},
author={Miguel Sozinho Ramalho},
year={2021},
eprint={2102.04293},
archivePrefix={arXiv},
primaryClass={cs.SI}
}
npm run install
npm run generate:gh-pages && npm run deploy
`cp example.env .env` and edit it
`docker-compose up` (pass `-d` for detached mode)
After you download the mongodump zip (in this case from Google Drive), restore it with one of the commands below. See the note further down on the `--noIndexRestore` option:
# windows
mongorestore --uri="mongodb://localhost:27017/" /d ew_db .\election-watch-folder\ --gzip
# linux
mongorestore --uri="mongodb://localhost:27017/" -d ew_db ./election-watch-folder --gzip
`election-watch-folder` is the folder inside the unzipped directory you downloaded (it contains the `.bson` files), and `ew_db` is the name you want your database to have.
The current implementation imposes a time to live (TTL) of 30 on the tweets collection to save storage. Because MongoDB automatically deletes documents that have outlived a TTL index, it is advisable to either import without indexes (although some are useful, such as the index on `tweets.user`) or delete the `created_at` index before performing any other operation. To import without indexes, append the `--noIndexRestore` option to the restore command.
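If the dump was restored with its indexes, the TTL index can be spotted by the `expireAfterSeconds` field that MongoDB stores on TTL index definitions. The following is a minimal sketch, assuming index documents shaped like the output of `db.collection.getIndexes()`. The helper name `findTtlIndexes`, the sample index names, and the sample `expireAfterSeconds` value are made up for illustration and are not part of Election Watch.

```javascript
// Hypothetical helper: return the names of TTL indexes from an array of
// index documents shaped like the output of db.collection.getIndexes().
function findTtlIndexes(indexes) {
  return indexes
    .filter((idx) => typeof idx.expireAfterSeconds === "number")
    .map((idx) => idx.name);
}

// Sample input, made up for illustration; real index names and values may differ.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { user: 1 }, name: "user_1" },
  { v: 2, key: { created_at: 1 }, name: "created_at_1", expireAfterSeconds: 3600 },
];

console.log(findTtlIndexes(sampleIndexes)); // [ 'created_at_1' ]
```

In the mongo shell, the same helper could be fed `db.getCollection('tweets').getIndexes()`, and each returned name passed to `db.getCollection('tweets').dropIndex(name)`. Review the list before dropping anything.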
For a password-protected database, run:
mongorestore -u USERNAME -p PASSWORD --authenticationDatabase admin --uri="mongodb://localhost:27017/" -d ew_db ./election-watch-folder
where `USERNAME` is typically `root`.
Check pre-commit.com for more pre-commit functionality, then add any hooks you want to the pre-commit config file.
To run them, execute `pre-commit run --all-files`.
// Total data size of the current database, in GB
db.stats(1024*1024*1024).dataSize + " GB";
// Top 50 most-mentioned users in original tweets, one "id - screen_name - count" per line
db.getCollection('tweets').aggregate([
{$match: {"original": true}},
{$unwind: '$user_mentions'},
{ $group: {
_id: '$user_mentions',
count: {$sum: 1}
// alternative weighting: count: {$sum: { $add : ['$favorite_count', '$retweet_count']}}
}},
{$sort: {count: -1}},
{$limit: 50},
{ $project: { count: 1, _id: '$_id' }}
]).map(x=>x._id + " - " + db.getCollection('users').find({_id: x._id}).map(y=>y.screen_name) + " - " + x.count).reduce((acc, prev) => acc + "\n" + prev)
// Top 50 hashtags by impact (favorites + retweets per original tweet), among hashtags used in at least 100 original tweets
db.getCollection('tweets').aggregate([
{$match: {"original": true}},
{$unwind: '$hashtags'},
{ $group: {
_id: '$hashtags',
count: {$sum: 1}, // number of original tweets the hashtag appears in
countWeight: {$sum: { $add : ['$favorite_count', '$retweet_count']}} // retweets + favorites
}},
{$project: {
impact: { $divide: [ "$countWeight", "$count" ] },
count: 1, countWeight: 1, _id: '$_id'
}},
{$match: {count : {$gte: 100}}},
{$sort: {impact: -1}}, {$limit: 50},
{ $project: { count: 1, countWeight: 1, impact: 1, _id: '$_id'}}
]).map(x=>"#" + x._id + "(" + x.impact + ") - " + x.count + " - " + x.countWeight).reduce((acc, prev) => acc + "\n" + prev)
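To make the ranking in the hashtag query above easier to follow, here is a minimal in-memory sketch of the same idea in plain JavaScript: impact is the sum of favorites and retweets divided by the number of original tweets using the hashtag. The function name `hashtagImpact`, its parameters, and the sample tweets are hypothetical. Unlike the aggregation, this sketch treats missing counts as 0.

```javascript
// Hypothetical in-memory version of the hashtag "impact" ranking.
// impact = (favorites + retweets) / number of original tweets using the hashtag
function hashtagImpact(tweets, minCount = 100, limit = 50) {
  const stats = new Map();
  for (const tweet of tweets) {
    if (tweet.original !== true) continue; // same filter as {$match: {original: true}}
    for (const tag of tweet.hashtags || []) { // plays the role of $unwind
      const s = stats.get(tag) || { _id: tag, count: 0, countWeight: 0 };
      s.count += 1;
      // Assumption: missing counts are treated as 0 in this sketch.
      s.countWeight += (tweet.favorite_count || 0) + (tweet.retweet_count || 0);
      stats.set(tag, s);
    }
  }
  return [...stats.values()]
    .filter((s) => s.count >= minCount)
    .map((s) => ({ ...s, impact: s.countWeight / s.count }))
    .sort((a, b) => b.impact - a.impact)
    .slice(0, limit);
}

// Sample tweets, made up for illustration.
const sampleTweets = [
  { original: true, hashtags: ["a", "b"], favorite_count: 4, retweet_count: 2 },
  { original: true, hashtags: ["a"], favorite_count: 0, retweet_count: 0 },
  { original: false, hashtags: ["a"], favorite_count: 100, retweet_count: 100 },
  { original: true, hashtags: ["b"], favorite_count: 1, retweet_count: 1 },
];

// A threshold of 2 is used here only because the sample is tiny.
const ranked = hashtagImpact(sampleTweets, 2);
console.log(ranked.map((x) => "#" + x._id + "(" + x.impact + ") - " + x.count + " - " + x.countWeight).join("\n"));
// #b(4) - 2 - 8
// #a(3) - 2 - 6
```

Dividing by the count rewards hashtags whose tweets are amplified on average, and the minimum-count filter keeps a single viral tweet from dominating the ranking.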
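// Remove the private and time_private fields from every user
db.getCollection('users').update({}, {$unset: {private: 1, time_private: 1}}, {multi: true})
// Count users with more than 500000 followers and depth > 0
db.getCollection('users').count({followers_count: {$gt: 500000}, depth: {$gt: 0}})
// Tweets created on 2020-09-18 that use a given hashtag (replace HASHTAG)
db.getCollection('tweets').find({"created_at": {$gte: new Date("2020-09-18"), $lt: new Date("2020-09-19")}, hashtags: {$in: ["HASHTAG"]}})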
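The date filter in the query above relies on date-only strings such as `"2020-09-18"` being parsed as midnight UTC, so the range covers one UTC calendar day. The sketch below builds such a range for any day. The helper name `utcDayRange` is hypothetical and not part of Election Watch.

```javascript
// Hypothetical helper: build a {$gte, $lt} filter covering one UTC calendar day.
function utcDayRange(day) {
  const start = new Date(day); // date-only ISO strings parse as midnight UTC
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000); // next midnight UTC
  return { $gte: start, $lt: end };
}

const range = utcDayRange("2020-09-18");
console.log(range.$gte.toISOString()); // 2020-09-18T00:00:00.000Z
console.log(range.$lt.toISOString()); // 2020-09-19T00:00:00.000Z
```

The returned object can be used as the value of the `created_at` filter. If the day should follow a local time zone instead, the boundaries need to be shifted accordingly.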
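// Comma-separated ids of users with at least 100000 followers
db.getCollection('users').find({followers_count: {$gte: 100000}}, {_id: 1}).map(function(item){ return item._id; }).reduce(function(acc, prev){return acc + "," + prev})
// Screen name, follows_political and follows_news for users with at least 100000 followers and depth > 0
db.getCollection('users').find({followers_count: {$gte: 100000}, depth: {$gt: 0}}).map(x=>x.screen_name + " - " + x.follows_political + " - " + x.follows_news);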
// Lowercase every hashtag stored in the tweets collection
db.getCollection('tweets').find({hashtags: {$exists: true}})
.forEach(function(tweet) {
tweet.hashtags = tweet.hashtags.map(function(h) {
return h.toLowerCase();
});
db.getCollection('tweets').save(tweet);
})
// Count users with at least 25 parsed tweets whose most common language is neither "pt" nor "und",
// whose follows_political is at most 2 (or missing) and whose tweeted_languages.pt is at most 5 (or missing)
db.getCollection('users').count({
"count_parsed_tweets": {"$gte": 25},
"most_common_language": {"$not": {"$in": ["pt", "und"]}},
$and: [
{$or: [
{follows_political: {$lte: 2}},
{follows_political: {$exists: false}}
]},
{"$or": [
{"tweeted_languages.pt": {"$exists": false}},
{"tweeted_languages.pt": {"$lte": 5}}
]}
]
})//.limit(200).map(x=>x.screen_name + ":" + x.follows_political + "," + x.follows_news + " - " + x.description).reduce((acc, prev) => acc + "\n" + prev)
```
</details>