Remy Liu RemyLau
RemyLau / parse_gtf.py
Last active February 4, 2023 16:09
Parse GTF file into a dataframe by extracting the attributes from the last (attr) column
import pandas as pd
# GTF files, e.g. https://ftp.ensembl.org/pub/current_gtf/homo_sapiens/Homo_sapiens.GRCh38.108.gtf.gz
# README for GTF: https://ftp.ensembl.org/pub/current_gtf/homo_sapiens/README
raw_df = pd.read_csv("some_file.gtf", sep="\t", comment="#", header=None)
# Parse attributes from the last column
# Example value: 'gene_id "ENSG00000160072"; gene_version "20"; gene_name "ATAD3B"; gene_source "ensembl_havana"; gene_biotype "protein_coding";'
# First convert each to a dict: {'gene_id': 'ENSG00000160072', 'gene_version': '20', 'gene_name': 'ATAD3B', 'gene_source': 'ensembl_havana', 'gene_biotype': 'protein_coding'}
# Then combine the list of dict into a dataframe
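The preview above stops before the parsing code, so the following is only a minimal sketch of the two steps the comments describe, not the gist's own implementation. The helper name `parse_attributes` and the regular expression are assumptions made for illustration, and the sketch assumes every attribute value is double-quoted, as in the example value shown in the comments.

```python
import re

# Assumed pattern: each attribute looks like `key "value"` and pairs are
# separated by semicolons, as in the example attribute string above.
ATTR_PATTERN = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(attr_string):
    """Convert one GTF attribute string into a dict of key -> value.

    Hypothetical helper. If a key occurs more than once in the string,
    only the last value is kept.
    """
    return dict(ATTR_PATTERN.findall(attr_string))


example = (
    'gene_id "ENSG00000160072"; gene_version "20"; gene_name "ATAD3B"; '
    'gene_source "ensembl_havana"; gene_biotype "protein_coding";'
)
parsed = parse_attributes(example)
print(parsed)
```

With `raw_df` loaded as above (`header=None`, so the attribute column is labelled `8`), the list of dicts could then be combined with something like `attr_df = pd.DataFrame(raw_df[8].map(parse_attributes).tolist())`. Keys missing from a row come out as `NaN` in that frame.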
RemyLau / pytorch_loss_brief.py
Last active March 20, 2022 20:11
Understanding the difference between cross entropy and negative log-likelihood loss as implemented in PyTorch (brief version)
import torch
torch.manual_seed(0)
# Binary setting ##############################################################
print(f"{'Setting up binary case':-^80}")
z = torch.randn(5)
yhat = torch.sigmoid(z)
y = torch.tensor([0, 1, 1, 0, 1], dtype=torch.float)
print(f"{z=}\n{yhat=}\n{y=}\n{'':-^80}")
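The preview ends right after the binary setup, so here is a plain-Python sketch (no PyTorch needed) of the arithmetic this setup leads to. It mirrors what `F.binary_cross_entropy` (on probabilities) and `F.binary_cross_entropy_with_logits` (on raw logits) compute with the default mean reduction. The function names and the logit values are illustrative assumptions, not taken from the gist.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def bce_from_probs(yhat, y):
    """Mean binary cross entropy from probabilities yhat and 0/1 labels y."""
    terms = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
             for p, t in zip(yhat, y)]
    return sum(terms) / len(terms)


def bce_from_logits(z, y):
    """Same loss from raw logits z, in the numerically stable form
    max(z, 0) - z * t + log(1 + exp(-|z|))."""
    terms = [max(v, 0.0) - v * t + math.log1p(math.exp(-abs(v)))
             for v, t in zip(z, y)]
    return sum(terms) / len(terms)


z = [1.5, -0.3, 0.8, -2.0, 0.1]   # hypothetical logits, not the seeded values
y = [0, 1, 1, 0, 1]               # same labels as in the setup above
yhat = [sigmoid(v) for v in z]

loss_probs = bce_from_probs(yhat, y)
loss_logits = bce_from_logits(z, y)
print(f"{loss_probs=:.6f} {loss_logits=:.6f}")
```

Both routes give the same value up to floating-point error. The logits form avoids taking the log of a probability that has rounded to 0 or 1.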
RemyLau / pytorch_loss.py
Last active March 12, 2026 05:33
Understanding the difference between cross entropy and negative log-likelihood loss as implemented in PyTorch
import torch
import torch.nn.functional as F
torch.manual_seed(0)
# Binary setting ##############################################################
print(f"{'Setting up binary case':-^80}")
z = torch.randn(5)
yhat = torch.sigmoid(z)
y = torch.tensor([0, 1, 1, 0, 1], dtype=torch.float)
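The gist description contrasts cross entropy with negative log-likelihood loss, but the preview is cut off before any multiclass code. As a hedged illustration of that relationship, the plain-Python sketch below shows that cross entropy on raw logits equals negative log-likelihood applied to log-softmax outputs, which is how `F.cross_entropy` relates to `F.nll_loss` combined with `F.log_softmax`. All function names and the sample logits are assumptions for illustration.

```python
import math


def log_softmax(row):
    """Log-probabilities for one row of logits, using the max-shift trick."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def nll_loss(log_probs, targets):
    """Mean negative log-likelihood; expects log-probabilities as input."""
    return -sum(lp[t] for lp, t in zip(log_probs, targets)) / len(targets)


def cross_entropy(logits, targets):
    """Mean cross entropy computed directly from raw logits."""
    total = 0.0
    for row, t in zip(logits, targets):
        m = max(row)
        lse = m + math.log(sum(math.exp(v - m) for v in row))
        total += lse - row[t]
    return total / len(targets)


logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3], [-0.5, 1.5, 0.0]]  # hypothetical
targets = [0, 2, 1]                                             # class indices

ce = cross_entropy(logits, targets)
nll = nll_loss([log_softmax(r) for r in logits], targets)
print(f"{ce=:.6f} {nll=:.6f}")
```

The practical difference is the expected input: the cross-entropy form takes unnormalized logits, while the NLL form assumes log-softmax has already been applied.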