YujiZhang CelestineZYJ

@emaadmanzoor
emaadmanzoor / ExpandEdinburghFSDCorpus.md
Last active October 31, 2020 20:30
Expand the Edinburgh Twitter FSD corpus

Expand The Edinburgh Twitter FSD Corpus

The Python scripts attached here handle the following tedious work, so you can get started on the corpus itself quickly:

  • Respect the Twitter API rate limits and throttle API hits.
  • Don't hit the API for tweet IDs that have already been expanded, so you can resume tweet expansion after stopping midway.
  • Parse the API response and dump it into the correct column in the sqlite3 database.
  • Gracefully handle exceptions while acquiring tweets from the API.
  • Wrap version 1.1 of the Twitter API.
  • Start from a specified tweet ID, assuming the input file is sorted in increasing order of tweet ID.
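The behaviours listed above can be sketched roughly as follows. This is a minimal illustration, not the attached scripts: the `expand_tweets` function, the `tweets` table layout, and the `fetch` callable (a stand-in for the real Twitter API call) are all assumptions made for the example.

```python
import sqlite3
import time


def expand_tweets(tweet_ids, fetch, conn, start_id=None,
                  min_interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Expand tweet IDs into text, skipping IDs already stored in `conn`.

    `fetch` is a hypothetical callable (tweet_id -> text) standing in for
    the real API call; the `tweets` table layout is likewise an assumption.
    Returns the number of API calls made.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, text TEXT, error TEXT)"
    )
    last_call = None
    fetched = 0
    for tweet_id in tweet_ids:
        # Input is assumed sorted by increasing ID, so skipping ahead
        # to the starting point is a simple comparison.
        if start_id is not None and tweet_id < start_id:
            continue
        # Resume support: never hit the API for an ID that is already stored.
        seen = conn.execute(
            "SELECT 1 FROM tweets WHERE id = ?", (tweet_id,)
        ).fetchone()
        if seen:
            continue
        # Throttle: leave at least `min_interval` seconds between API calls.
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        try:
            text, error = fetch(tweet_id), None
        except Exception as exc:  # e.g. a deleted or protected tweet
            text, error = None, str(exc)
        conn.execute(
            "INSERT INTO tweets (id, text, error) VALUES (?, ?, ?)",
            (tweet_id, text, error),
        )
        conn.commit()  # commit per tweet so an interrupted run loses nothing
        fetched += 1
    return fetched
```

Note that this sketch also records failed lookups, so a resumed run does not retry them; whether that is desirable depends on how often failures are transient. The value of `min_interval` should be chosen to fit whatever rate limit applies to the endpoint in use.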
@fabianp
fabianp / ranking.py
Last active December 24, 2025 18:54
Pairwise ranking using scikit-learn LinearSVC
"""
Implementation of pairwise ranking using scikit-learn LinearSVC
References:
"Large Margin Rank Boundaries for Ordinal Regression", R. Herbrich,
T. Graepel, K. Obermayer 1999
"Learning to rank from medical imaging data." Pedregosa, Fabian, et al.,
Machine Learning in Medical Imaging 2012.