``And once the storm is over you won't remember how you made it through, how you managed to survive. You won't even be sure, in fact, whether the storm is really over. But one thing is certain. When you come out of the storm you won't be the same person who walked in. That's what this storm's all about.''
-Haruki Murakami
----------------------------------------------
The lab is excited to announce opportunities for masters and undergraduate research credits. Please see the open positions page for details.
----------------------------------------------
From left: Patrick Kreitzberg, Oliver Serang, Kyle Lucke, Jake Pennington
Congratulations to Kyle Lucke for the publication for his first first-author paper (Lucke, Pennington, Kreitzberg, Käll, and Serang) on the topic of layer-ordered heaps as a faster alternative to sorting hypotheses when performing non-parametric statistical tests!
----------------------------------------------
Members hiked in the mountains above Potomac,
MT. Afterward, they played Kubb (Viking
chess).
STAGE | MATCH | WINNER |
---|---|---|
SEMIFINAL | PATRICK+ANNIKA vs ROOSTER+KYLE | ROOSTER+KYLE |
FINAL | ROOSTER vs KYLE | ROOSTER |
GEWINNER: Jake (Rooster) Pennington won in a final closely contested with Kyle Lucke. Oliver's participation drew scrutiny after he'd awarded himself Hero of the Soviet Union an ambitious fourth time.
----------------------------------------------
is the world's fastest isotope engine. (Also, since we are, as of now, alone in the universe, it is also the fastest isotope engine in the universe [braggable].)
Built with newly developed layer-ordered heap theory for selection on Cartesian products, NeutronStar generates the top million peaks of complicated compounds with many isotope states per in element less than a second.
Read the Analytical Chemistry publication, Kreitzberg, Pennington, Lucke, and Serang.
Check out the software here.
----------------------------------------------
HUCKLEBERRY PICKING & LAB PICNIC (FINAL STANDINGS):
COMPETITOR | COLD WATER ENDURANCE | MOST HUCKLEBERRIES | LARGEST HUCKLEBERRY | SUM |
---|---|---|---|---|
PATRICK | 3 | 1 | 4 | 8 |
KYLE | 3 | 4 | 3 | 7 |
ROOSTER | 1 | 2 | 1 | 4 |
ANNIKA | 2 | 3 | 2 | 7 |
GEWINNER: Jake (Rooster) Pennington took first prize. Oliver's performance was disqualified after he tested positive for rooibos tea.
----------------------------------------------
Kyle Lucke's work on cardinal Bayesian models for protein inference was a finalist for the best of the University of Montana's GradCon 2020!
----------------------------------------------
Using the newly invented layer-ordered heap, the lab recently created a new optimal method for selecting the top values from the Cartesian product (or sum) $$X+Y.$$ This means, given two lists $$X,Y,$$ it finds the smallest k terms of the form $$X_i+Y_j.$$ If the lists are length n, it finds these smallest k values in $$O(n+k).$$ This approach is simple and performs well in practice. Read the preprint here.
----------------------------------------------
Patrick Kreitzberg, Kyle
Lucke, and Oliver Serang created the first efficient
methods for performing selection on the Cartesian sum
$$Y=X_1+X_2+\cdots+X_m.$$ The technique is important for
Bayesian inference under additive dependencies, computing
the most abundant isotope peaks of a compound, computing
the most efficient k configurations to build a
product with m independent supply lines, and
more.
By using a new data structure called a ``layer-ordered
heap,'' one can compute the top k terms of a
Cartesian product of m arrays, each of
length n, in $$o(m\cdot n + k\cdot m)$$
time. Consider that simply loading the data costs m
n and accessing each of the m indices at each
of the k top values costs k m; this is a
remarkable result.
Read
the preprint
here.
----------------------------------------------
Exploring Scientific Wilderness is back for its second
season! Join us (Patrick Kreitzberg, Kyle Lucke, Max
Thibeau, Jake Pennington, and Oliver Serang) for over
20 new episodes, including science tutorials,
interviews with scientists, and more! Episodes will be
released weekly on Sunday evenings for the next six
months.
This season features music from the phenomenal
synthwave record label Italians Do It Better.
----------------------------------------------
Oliver Serang's textbook, Introduction to Cybersecurity, is now online.
Check it out here.
----------------------------------------------
Lab members each chose a set of present human proteins, and these were used to simulate an MS/MS data set. Using only target-decoy information and intuition, lab members built models and performed protein inference. Using the secret present proteins from each data set, the participants' submissions were ranked (via AUC of the ROC at a q-value <0.1). The participant with the best sum of ranks over the data sets was named the winner.
FINAL STANDINGS:
COMPETITOR | JAKE DATA | KYLE DATA | SEAN DATA | PATRICK DATA | MAX DATA | SUM |
---|---|---|---|---|---|---|
JAKE | ---- | 2 | 2 | 2 | 1 | 7 |
KYLE | 3 | ---- | 1 | 1 | 3 | 8 |
SEAN | 4 | 4 | ---- | 3 | 2 | 13 |
PATRICK | 2 | 1 | 3 | ---- | 4 | 10 |
MAX | 1 | 3 | 4 | 4 | ---- | 12 |
GEWINNER: Jake Pennington took first prize (Little Debbie chocolate cupcakes) with Kyle Lucke in a photo-finish for second prize (Little Debbie powdered sugar donuts).
----------------------------------------------
Congratulations to Patrick Kreitzberg for the publication for his first first-author paper (Kreitzberg, Bern, Shu, Yang, and Serang) on the topic of his masters thesis!
----------------------------------------------
Having won its bid for the 2019 Serang Lab Olympics, Missoula hosted competitors at Bonner Park to compete in a triathlon of lab events: The first event is the longest frisbee throw with successful catch. Participants select a partner of choice, who gets a four second head start running before the competitor throws. Completed catches are marked with sparkling water (competitor's flavor of choice), and the best of three throws was used for scoring. The second event was the egg toss. In this case, competitors must work together. Each completed throw results in one step back for all remaining competitors, until only one egg remains. The last event is the crab walk race. Going to the telephone pole would be too easy (You know what else is easy? TV dinners! For once, ask the most of yourself!): competitors must go around the telephone pole and return.
FINAL STANDINGS:
COMPETITOR | FRISBEE RANK | EGG RANK | CRAB RANK | RANK SUM |
---|---|---|---|---|
MAX | 1 | 3 | 4 | 8 |
JAKE | 3 | 2 | 8 | 13 |
SEAN | 4 | 2 | 1 | 7 |
SARAH | 5 | 1 | 5 | 11 |
KYLE | 6 | 1 | 6 | 13 |
BLAKE | 6 | 3 | 3 | 12 |
PATRICK | 6 | 1 | 3 | 12 |
OLIVER | -- | -- | -- | -- |
GEWINNER: Sean Rice wins the Strawberry-Watermelon bubble yum. Oliver was disqualified due to his suspiciously good performance after testing positive for peppermint tea.
----------------------------------------------
Oliver Serang has received an NSF CAREER award! This
grant will fund the development of new combinatorics
methods, K-12 curriculum on teaching mathematics and
computer science, and a new season of the Exploring
Scientific Wilderness podcast.
You can follow the fruits of this grant
on this page.
----------------------------------------------
Patrick Kreitzberg and Oliver Serang presented at ASMS
2019 in Atlanta, where Patrick got his first taste of
Szechuan food in his life.
To read Patrick's masters thesis (on computing de
novo neutral loss alphabets from mass spectra),
click here. To
get the code (MIT licese),
visit https://bitbucket.org/orserang/peak-bagger.
To see Kyle Lucke et al.'s poster (which was
presented by Oliver) on the the hitchhiking proteins
during protein inference, click here.
----------------------------------------------
Patrick Kreitzberg will defend his Masters Thesis on Wednesday, June 19th at 1PM in Social Sciences 362. Read his thesis here!
----------------------------------------------
The department of Computer Science at the University of Montana will now offer Cybersecurity (both undergraduate and graduate level)!
----------------------------------------------
----------------------------------------------
Patrick Kreitzberg's work on meta de novo mass spectrometry methods, "The Alphabet Projection of Mass Spectrometry Data", was a finalist for the best of the University of Montana's GradCon 2019!
----------------------------------------------
To celebrate the new year, lab members jumped into the snowy Clark Fork and then everyone went to warm up with some ramen and then cool back down with a quick game of Race for the Galaxy over ice cream.
Only 170 days until the days start getting shorter again!
----------------------------------------------
Lab members went camping in the mountains east of Missoula. It was damp, but everyone worked together to strip the wood, saw/baton logs, and build a good fire before the snow came.
This cozy season, which is viewed as the time when the veil between the living and the dead is its thinnest, proved to be a perfect time to introduce everyone to "cheese dogs".
----------------------------------------------
Oliver Serang's textbook, Algorithms in Python is now online.
Check it out here.
----------------------------------------------
During an afternoon of Patrick and Oliver playing fetch with Patrick's dog Butters, Butters was recorded in a video that is slated for use in offial promotion for the University of Montana.
----------------------------------------------
The Serang Lab has been awarded a COBRE grant. This grant will be used to hire UM graduate students to build new graphical models for two cutting-edge problems from computational biology.
----------------------------------------------
The University of Montana has awarded the Serang Lab a pilot project grant. This grant will fund a UM computer science student to work on an exciting project developing EvergreenForest, a new graphical models library.
----------------------------------------------
Oliver Serang's textbook, Code Optimization in C++11 is now online.
Check it out here.
----------------------------------------------
Exploring Scientific Wilderness is a new podcast miniseries with short and sweet pieces of scientific brain candy.
----------------------------------------------
After years of work, the beta version of
EvergreenForest has been released! It includes a fast,
lazy implementation of trimmed convolution trees, a
fast in-house FFT written in C++11 (using the lab's
work on methods for performing bit-reversed
permutation), and lots of other Waldmeister-flavored
goodies to activate your almonds.
You can read more about it and download it
here.
----------------------------------------------
Trimmed convolution trees use sparsity in the solution
space exploit sparsity in the input priors and
likelihood of a convolution tree to achieve an
algorithmic and practical speedup. For example, when
all input priors are $$X_i \in \{0,1\}$$ and the
likelihood $$Y=X_1 + X_2 + \cdots X_n \in \{0,1\},$$
all posteriors can be solved with trimmed convolution
trees in $$O(n)$$ rather than the $$O(n \log(n)
\log(n))$$ required by non-trimmed convolution
trees.
Trimmed convolution trees generalize this to use
narrower priors and likelihoods, even when some priors
have larger support than others or when the likelihood
distribution has are larger support than all priors,
smaller support, or anything, making the first method
$$\in o(nk \log(nk) \log(n))$$ on some problems with
$$n$$ inputs each with $$k$$ bins.
The paper (in review) includes a demo that uses
trimmed convolution trees to estimate posterior
probabilities for peoples' orders given their total
ice cream bill; the demo is, of course, written using
the menu from
Missoula's
own Big Dipper Ice Cream.
----------------------------------------------
The bit-reversed permutation is an important task for
implementing an in-place FFT library. This paper
explores several bit reversal methods on the x86
architecture, including new bit twiddling tricks and a
proposed template-recursive, cache-oblivious method,
which achieves state-of-the-art performance.
The paper draft can be found
here. The
repository (linked in the paper) includes translations
in the mother tongue of each student who participated
(Chinese, German, Turkish, and Spanish) to help
students throughout the world participate in the joy
of scientific computing. So when someone says, ``Bit
reversal? How?'' you can say, ``很好!''
----------------------------------------------
Florian Heyl, who is doing his masters Praktikum in the
group, has just had his first paper accepted for
publication. The paper (Heyl & Serang 2017,
PROGRAMMING) introduces and benchmarks the
``template-recursive iteration over tensors'' (TRIOT)
design pattern in C++11. The method enables easy
vectorizing over tensors of different shaps and does not
require tensor dimensions to be fixed at compile
time.
Download the free header-only
library here.
----------------------------------------------
The lab's
second paper on the
topic (Pfeuffer
& Serang 2016, Journal of Machine Learning Research)
builds upon the original idea, looking at the curve of
(modeled using orthogonal polynomials) of a constant
number of $$L_p$$ spaces. The way that these spaces bend
toward the true maximum can allow us to estimate the
maximum better than any particular numerically stable
p-norm.
The paper highlights the approach with faster prediction
of U.S. unemployment given S&P 500 index data (via a
fast Viterbi path computation).
----------------------------------------------
Max-convolution, a cousin of standard convolution but on
the semiring $$( \times, \max )$$ (consider multiplying
two polynomials but taking the maximum coefficient for
every term rather than the sum of coefficients for that
term), arises frequently in combinatorics and in
Bayesian inference. In 2006, Bremner et al. discovered
the first solution $$\in o(n^2);$$ theirs is a beautiful
method, but it's only slightly sub-quadratic and relies
on a reduction to APSP, and it is not fast enough for
large-scale use in Bayesian networks. However, this
problem can be approximated numerically by using $$L_p$$
space, where FFT can be used in $$\in O(n
\log(n))$$ (Serang
2015, Journal of Computational Biology). The method
can also be used to improve hidden Markov models (HMMs)
with with k states from $$O(k^2 n)$$ to $$O(k \log(k)~
n),$$ whenever the transition matrix is a Toeplitz
matrix.
Combined with the probabilistic convolution tree, the
method can be used for max-product inverence when $$Y =
X_1 + X_2 + \cdots + X_n.$$ This can be used to solve a
generalization of the knapsack problem on
integer-discretized supports.
----------------------------------------------
The probabilistic convolution tree (Serang 2014, PLOS ONE) is a new dynamic programming algorithm that solves a class of probabilistic problems, which are equivalent to a generalization of subset-sum on integer-discretized supports. When $$Y = X_1 + X_2 + \cdots + X_n$$ and each $$X_i \in \{0, 1, \ldots k-1\}$$ it computes all marginals simultaneously with a runtime $$\in O(n\cdot k \log(n\cdot k) \log(n)).$$
----------------------------------------------