Skip to content

Instantly share code, notes, and snippets.

View mikelove's full-sized avatar

Michael Love mikelove

View GitHub Profile
@mikelove
mikelove / make_variant_table.R
Created December 18, 2025 14:09
backup for variant table creation script for liver IGVF
library(here)
library(plyranges)
source(here("analysis/common/import.R"))
# This script imports variant-level summary statistics and adds QC variables
# e.g.
# - sufficient activity of at least one of the alleles
# - balanced representation of reference and
# alternate alleles in the input DNA
@mikelove
mikelove / alt_utr.R
Created October 31, 2025 16:45
Exploring alternative UTR code with GenomicFeatures and plyranges
library(GenomicRanges)
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
ebt <- exonsBy(txdb, by="tx")
cbt <- cdsBy(txdb, by="tx")
fubt <- fiveUTRsByTranscript(txdb)
tubt <- threeUTRsByTranscript(txdb)
@mikelove
mikelove / deseq2_fix_priors.R
Created September 24, 2025 18:45
DESeq2 on second set of rows using fixed priors
library(DESeq2)
dds <- makeExampleDESeqDataSet()
dds_leaf <- dds[1:500,]
dds_inner <- dds[501:1000,]
dds_leaf <- DESeq(dds_leaf)
res_leaf <- lfcShrink(dds_leaf, coef="condition_B_vs_A", type="apeglm")
# this runs DESeq2 on inner nodes but using priors from leaves
sizeFactors(dds_inner) <- sizeFactors(dds_leaf)
dds_inner <- estimateDispersionsGeneEst(dds_inner)
@mikelove
mikelove / hookup-tidySE-plyxp
Last active June 26, 2025 15:55
hooking up tidySE to plyxp, example case
library(tidySummarizedExperiment)
library(plyxp)
nfeat <- 1e4
nsamp <- 1e3
coldata <- data.frame(x=rep(1:2,length=nsamp))
assay <- matrix(rnorm(nfeat*nsamp), ncol=nsamp)
rownames(assay) <- paste0("f",1:nfeat)
colnames(assay) <- paste0("s",1:nsamp)
library(dplyr)
library(tidyr)
library(purrr)
simulate_genotype <- function(maf, n) rbinom(n, 2, maf)
simulate_data <- function(beta, maf, n) {
data.frame(
genotype = simulate_genotype(maf, n)
) |>
@mikelove
mikelove / tiles.R
Created March 17, 2025 13:56
Adding metadata to tiles
> x <- data.frame(seqnames="chr1", start=c(1,101), width=100, foo=c("bar","boo")) |> as_granges()
> x
GRanges object with 2 ranges and 1 metadata column:
seqnames ranges strand | foo
<Rle> <IRanges> <Rle> | <character>
[1] chr1 1-100 * | bar
[2] chr1 101-200 * | boo
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> x |> tile_ranges(25)
@mikelove
mikelove / spline_GP_nonuniform_x.R
Created January 9, 2025 20:13
splines and GP disagreeing
# Created by chat GPT
# Load necessary libraries
library(splines)
library(GPfit)
# Set seed for reproducibility
set.seed(123)
# Generate non-uniform data
@mikelove
mikelove / biocmask_examples.R
Last active August 7, 2024 23:50
Trying out Justin's biocmask implementation of tidyomics for SE
load_all("../biocmask")
suppressPackageStartupMessages(library(airway))
data(airway)
rowData(airway)$group <- sort(rep(1:10, length=nrow(airway)))
library(tibble)
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(purrr))
airway |>
@mikelove
mikelove / mpra_json.R
Last active July 23, 2024 13:20
file to make JSON format
library(jsonlite)
library(readr)
dat <- read_csv("IGVF MPRA FG designed sequence metadata - Sheet1.csv")
dat$constraints <- as.list(dat$constraints)
dat$constraints[[1]] <- list(required=TRUE)
dat$constraints[[2]] <- list(required=TRUE)
dat$constraints[[3]] <- list(
required=TRUE,
enum=strsplit(sub(".*enum: \\[(.*)\\]","\\1",dat$constraints[[3]]),", ")[[1]]
)
@mikelove
mikelove / fluent-genomics-v2.qmd
Last active June 11, 2024 20:54
fluent genomics update
---
title: "Differential chromatin accessibility and gene expression"
format: html
---
# Differential expression from RNA-seq
```{r}
#| eval: FALSE
dir <- system.file("extdata", package="macrophage")