opam-version: "2.0"
authors: "Francois Berenger"
maintainer: "unixjunkie@sdf.org"
homepage: "https://github.com/UnixJunkie/linwrap"
bug-reports: "https://github.com/UnixJunkie/linwrap/issues"
dev-repo: "git+https://github.com/UnixJunkie/linwrap.git"
license: "BSD-3-Clause"
build: ["dune" "build" "-p" name "-j" jobs]
install: ["cp" "bin/ecfp6.py" "%{bin}%/linwrap_ecfp6.py"]
depends: [
  "base-unix"
  "batteries" {>= "3.0.0"}
  "bst"
  "conf-liblinear-tools"
  "cpm" {>= "11.0.0"}
  "dokeysto_camltc"
  "dolog" {>= "6.0.0"}
  "dune" {>= "1.10"}
  "minicli" {>= "5.0.0"}
  "molenc"
  "parany" {>= "11.0.0"}
]
# The software can compile and install without the depopts;
# however, some tools and options will not work at run-time.
depopts: [
  "conf-gnuplot"
  "conf-python-3"
  "conf-rdkit"
]
synopsis: "Wrapper on top of liblinear-tools"
description: """
Linwrap can be used to train an L2-regularized logistic regression classifier
or a linear Support Vector Regressor (SVR).
You can optimize C (the L2 regularization parameter), w (the class weight)
or k (the number of bags, i.e. use bagging).
You can also find the optimal classification threshold using MCC maximization,
use k-fold cross-validation, parallelization, etc.
In the regression case, you can only optimize C and epsilon.

When using bagging, each model is trained on balanced bootstraps
from the training set (one bootstrap for the positive class,
one for the negative class).
The size of each bootstrap is the size of the smallest (under-represented)
class.
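For example (hypothetical numbers): with 100 positives and 900 negatives
in the training set, each bag would be trained on a bootstrap of
100 positives plus a bootstrap of 100 negatives.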

usage: linwrap
  -i <filename>: training set or DB to screen
  [-o <filename>]: predictions output file
  [-np <int>]: ncores
  [-c <float>]: fix C
  [-e <float>]: fix epsilon (for SVR);
      (0 <= epsilon <= max_i(|y_i|))
  [-w <float>]: fix w1
  [--no-plot]: no gnuplot
  [-k <int>]: number of bags for bagging (default=off)
  [{-n|--NxCV} <int>]: folds of cross validation
  [--mcc-scan]: MCC scan for a trained model (requires n>1)
      also requires (c, w, k) to be known
  [--seed <int>]: fix random seed
  [-p <float>]: training set portion (in [0.0:1.0])
  [--pairs]: read from .AP files (atom pairs; will offset feat. indexes by 1)
  [--train <train.liblin>]: training set (overrides -p)
  [--valid <valid.liblin>]: validation set (overrides -p)
  [--test <test.liblin>]: test set (overrides -p)
  [{-l|--load} <filename>]: prod. mode; use trained models
  [{-s|--save} <filename>]: train. mode; save trained models
  [-f]: force overwriting existing model file
  [--scan-c]: scan for best C
  [--scan-e <int>]: epsilon scan #steps for SVR
  [--regr]: regression (SVR); also implied by -e and --scan-e
  [--scan-w]: scan weight to counter class imbalance
  [--w-range <float>:<int>:<float>]: specific range for w
      (semantic=start:nsteps:stop)
  [--e-range <float>:<int>:<float>]: specific range for e
      (semantic=start:nsteps:stop)
  [--c-range <float,float,...>]: explicit scan range for C
      (example='0.01,0.02,0.03')
  [--k-range <int,int,...>]: explicit scan range for k
      (example='1,2,3,5,10')
  [--scan-k]: scan number of bags (advice: optim. k rather than w)
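
Example invocations (hypothetical file names; options as documented above):
  # train a classifier with 5-fold cross-validation, scanning for the best C,
  # using 4 cores, then save the trained models
  linwrap -i train.liblin -np 4 -n 5 --scan-c -s model.bin
  # production mode: screen a DB with the saved models and write predictions
  linwrap -i db_to_screen.liblin -l model.bin -o preds.txt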
81"""
url {
  src: "https://github.com/UnixJunkie/linwrap/archive/v9.1.0.tar.gz"
  checksum: [
    "sha256=dabf5b1eec310c73a2d61d82275bc9da904653dc2d16ffc40cc5627c1393e04a"
    "md5=b7bdf8ac6a6a2af8033c6bef82b6c2ca"
  ]
}