Server file structure
Server Routing and URLs
Data representation and structure
Parses the uploaded files
K-means API and usage
Tree/Graph generation
Data Projection API and usage
Functions used all over the project
This is the file structure of the main directory:
QCRI/
├── bin/
│ ├── www
│ ├── RUNS THE SERVER
├── docs/
│ ├── CONTAINS THE DOCS YOU ARE VIEWING RIGHT NOW
├── public/
│ ├── ALL FILES PUBLICLY SERVED ARE STORED HERE
│ ├── javascript/
│ ├── stylesheets/
├── routes/
│ ├── CONTAINS SERVER ROUTES AND ALGORITHMS
│ ├── index.js
│ ├── kmeans.js
│ ├── tree.js
│ ├── projection.js
│ ├── file-parser.js
│ ├── file-writer.js
│ ├── redis-util.js
│ ├── exporter.js
│ ├── matrix_utils.js
├── views/
│ ├── CONTAINS HTML PAGES THAT ARE SERVED TO THE USER
│ ├── index.ejs
│ ├── dataview.ejs
│ ├── error.ejs
├── redis/
│ ├── CONTAINS SCRIPTS THAT TAKE CARE OF REDIS
│ ├── install_redis.sh
│ ├── start_redis.sh
├── uploads/
│ ├── CONTAINS UPLOADED FILES BY THE USER
├── app.js ─ SERVER CONFIGURATION FILE
├── package.json ─ DEPENDENCIES FILE
└── run_server.js ─ SCRIPT TO RUN THE SERVER
Most of the URLs, which are called routes, are defined inside routes/index.js.
/
Renders the homepage
/uploads
Computes the results and renders the visualization box
/kUpdate
Runs the K means update and returns the result using AJAX
/download
Incomplete
Exports all the files and returns a zip file (zips all the files together)
/projected
Projects the data and returns it using AJAX
/getData
Returns the data associated with the user in the browser (data stored inside redis)
/remUser
Removes the user from redis with all of the associated data
A point is represented as an array of integers/floats.
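As a small illustration of this representation, the sketch below builds a data set as an array of points and measures the distance between two of them. The `distance` helper is hypothetical and is not taken from the project source; it only shows how code can work with points stored as plain arrays.

```javascript
// A point is a plain array of numbers; a data set is an array of points.
const points = [
  [1, 2],
  [4, 6],
];

// Hypothetical helper (not part of the project): Euclidean distance
// between two points of the same dimension.
function distance(a, b) {
  if (a.length !== b.length) {
    throw new Error('points must have the same dimension');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

console.log(distance(points[0], points[1])); // 5
```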
The file parser is a class called FileParser. It is used to parse the files the user uploads. The source code can be found inside routes/file-parser.js.
The methods below are all public methods.
FileParser.parseData(filePath)
Parses the data file (*.dat) and stores the result inside the this.dat variable.
Requires: the file path of the data file
Returns: this.dat, which contains the parsed data
FileParser.getSbp(filePath)
Filters the data using the subspace file (*.sbp) and stores the result inside the this.sbp variable.
Requires: the file path of the subspace file
Returns: this.sbp, which contains the filtered subspace
FileParser.getFts(filePath)
Parses the features file (*.fts) and stores the result inside the this.fts variable.
Requires: the file path of the features file
Returns: this.fts, which contains the features
FileParser.getTis(filePath)
Parses the tissue file (*.tis) and stores the result inside the this.tis variable.
Requires: the file path of the tissue file
Returns: this.tis, which contains the tissues
FileParser.getGrd(filePath)
Parses the grade file (*.grd) and stores the result inside the this.grd variable.
Requires: the file path of the grade file
Returns: this.grd, which contains the grades
FileParser.getInd(filePath)
Parses the indicator features file (*.ind) and stores the result inside the this.ind variable.
Requires: the file path of the indicator features file
Returns: this.ind, which contains the indicator features
FileParser.parseSbp()
Parses the subspace depending on the features file and stores the result inside the this.dat variable.
Requires: none
Returns: this.dat, which contains the correctly parsed data
There are more unsupported methods that still need to be implemented.
// data to be parsed from uploaded raw data
this.dat = []; // data file
this.sbp = []; // subspace file
this.tis = []; // tissue file
this.fts = []; // feature file
this.grd = []; // grade file
this.ind = []; // indicator file
this.dimension = ''; // dimension of the uploaded data
/*********
* Variables below are unsupported
********/
// data to be parsed from pre-computed data
// unzipped folder
this.unZipped = '';
// Kmeans data
this.centers = '';
this.dist2Clusters = '';
this.clusters = '';
// Tree data
this.tree = '';
// Projection data
this.projection = '';
Use the above methods in the following recommended sequence:
var parser = new FileParser();
// parse the data
parser.parseData(filePathDat);
var sbp = parser.getSbp(filePathSbp);
var fts = parser.getFts(filePathFts);
var tis = parser.getTis(filePathTis);
var grd = parser.getGrd(filePathGrd);
var ind = parser.getInd(filePathInd);
var dim = parser.dimension;
// parses the subspace
var dat = parser.parseSbp();
// it's safe to use the data now
The K-means algorithm is a class called kmeans. The source code can be found inside routes/kmeans.js.
The methods below are all public methods.
kmeans(data, numK, dim, tisNames, centers)
Runs the kmeans algorithm on the data
Requires:
data: array of points
numK: number of prototypes
dim: dimension of the data
tisNames: array containing the tissue names
centers (optional): already computed centers to use instead of randomly generated prototypes (used when updating the K-means)
Returns: An object of the following format
output: { "centers": [c1, c2,...,ck], // coordinates of the centers of the prototypes
"clusters": [ [pi, pi+1,...],...,[pi+4, pi+n,...] ], // 2D array containing the points closest to each cluster (which points belong to which cluster)
"dist2Clusters": [ [{cIndex: 1, dis: distance},...,{cIndex: k, dis: distance}],...], // 2D array containing the distances between each point and all clusters
"tisNames": [[tisX1, tisX2,...],...,[tisX45, tisX49]] // 2D array containing the tissue names of the points in each cluster
}
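To make the shape of this object concrete, here is a minimal sketch of a single assignment step that produces the same four fields. It is not the code in routes/kmeans.js: `assignToCenters` and `euclidean` are hypothetical names, the 0-based `cIndex` is an assumption, and the iterative center update of K-means is left out.

```javascript
// Hypothetical helper: Euclidean distance between two points.
function euclidean(a, b) {
  return Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));
}

// Assigns every point to its nearest center and returns an object shaped
// like the documented output. Only the assignment step is shown here,
// not the full iterative algorithm.
function assignToCenters(data, centers, tisNames) {
  const clusters = centers.map(() => []);
  const names = centers.map(() => []);
  const dist2Clusters = data.map((point, p) => {
    // Distance from this point to every center (cIndex is 0-based here).
    const row = centers.map((c, k) => ({ cIndex: k, dis: euclidean(point, c) }));
    const best = row.reduce((min, entry) => (entry.dis < min.dis ? entry : min));
    clusters[best.cIndex].push(point);
    names[best.cIndex].push(tisNames[p]);
    return row;
  });
  return { centers, clusters, dist2Clusters, tisNames: names };
}

const out = assignToCenters(
  [[0, 0], [1, 0], [10, 10]],
  [[0, 0], [10, 10]],
  ['a', 'b', 'c']
);
console.log(out.clusters); // [ [ [ 0, 0 ], [ 1, 0 ] ], [ [ 10, 10 ] ] ]
console.log(out.tisNames); // [ [ 'a', 'b' ], [ 'c' ] ]
```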
Competitive Hebbian Learning (CHL) is used to generate a tree/graph from the data produced by the K-means algorithm.
The CHL algorithm is a function called genTree. The source code can be found inside routes/tree.js.
genTree(distances, numK)
Generates an undirected graph or sometimes a tree
Requires:
distances: 2D array containing the distances between each point and all clusters generated by the K-means algorithm
numK: number of prototypes
dim: dimension of data
Returns: An array of edges of the following format
output: [ {w: weight, n: [src, dest]},...]
w --> weight of the edge
src --> source node (int representing the number of the node)
dest --> destination node (int representing the number of the node)
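The general idea of Competitive Hebbian Learning can be sketched as follows: for every point, the two closest prototypes are connected by an edge. This is an illustration, not the code in routes/tree.js. The function name `chlEdges` is hypothetical, each row of `distances` is assumed to be a list of `{cIndex, dis}` objects as in the K-means output, and using the number of supporting points as the edge weight is an assumption.

```javascript
// Sketch of Competitive Hebbian Learning: for each point, connect its two
// nearest prototypes. The weight of an edge is assumed to be the number
// of points that selected that pair of prototypes.
function chlEdges(distances, numK) {
  if (numK < 2) return []; // an edge needs at least two prototypes
  const counts = new Map(); // "src-dest" -> number of supporting points
  for (const row of distances) {
    // Copy before sorting so the caller's array is left untouched.
    const sorted = row.slice().sort((a, b) => a.dis - b.dis);
    const src = Math.min(sorted[0].cIndex, sorted[1].cIndex);
    const dest = Math.max(sorted[0].cIndex, sorted[1].cIndex);
    const key = src + '-' + dest;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts, ([key, w]) => ({ w, n: key.split('-').map(Number) }));
}

// Three points, three prototypes.
const rows = [
  [{ cIndex: 0, dis: 1 }, { cIndex: 1, dis: 2 }, { cIndex: 2, dis: 9 }],
  [{ cIndex: 0, dis: 2 }, { cIndex: 1, dis: 1 }, { cIndex: 2, dis: 8 }],
  [{ cIndex: 0, dis: 9 }, { cIndex: 1, dis: 2 }, { cIndex: 2, dis: 1 }],
];
console.log(chlEdges(rows, 3));
// [ { w: 2, n: [ 0, 1 ] }, { w: 1, n: [ 1, 2 ] } ]
```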
The exporter is a class called Exporter. It is used to export the data computed and stored inside redis. The source code can be found inside routes/exporter.js.
The Exporter uses CSVwriter, which is found inside routes/file-writer.js, to output the files in CSV format. A race condition occurs between writing the CSV files and zipping them together: the Exporter can zip empty files because CSVwriter has not yet finished writing them when the zipping starts.
Exporter(user, writer)
Exports all the computed data and stores it inside tmp/user/
Requires:
user: current username
writer: writer used to export the data (the only writer available is CSVwriter)
Returns: nothing
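One possible way to avoid the race condition described above is to have every write return a promise and to start zipping only after all of them have resolved. The sketch below shows that ordering with Node's built-in `fs.promises`. The names `writeCsv`, `exportAll`, and `archive` are hypothetical stand-ins and do not reflect the actual CSVwriter or Exporter interfaces.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for CSVwriter: writes one CSV file and returns a
// promise that resolves only once the file has been written.
function writeCsv(filePath, rows) {
  const text = rows.map((row) => row.join(',')).join('\n') + '\n';
  return fs.promises.writeFile(filePath, text);
}

// Hypothetical export step: waits for every write to finish before calling
// `archive`, which stands in for the zipping code.
async function exportAll(dir, files, archive) {
  await fs.promises.mkdir(dir, { recursive: true });
  const paths = await Promise.all(
    Object.keys(files).map(async (name) => {
      const filePath = path.join(dir, name + '.csv');
      await writeCsv(filePath, files[name]);
      return filePath;
    })
  );
  return archive(paths); // all files are complete at this point
}

// Usage: report the size of each written file instead of zipping it.
exportAll(
  path.join(os.tmpdir(), 'export-sketch'),
  { centers: [[0, 0], [1, 1]] },
  (paths) => paths.map((p) => fs.statSync(p).size)
).then((sizes) => console.log(sizes));
```

The key point is that `archive` runs after `Promise.all` has resolved, so it never sees a file that is still being written.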
There are two utility files which include simple functions that are used all over the project.
The first file is routes/matrix_utils.js, which has simple matrix utility functions that are used when projecting the data.
The second file is routes/redis-util.js, which has redis utility functions that can be used for reference.
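As an illustration of the kind of matrix helper that is useful when projecting data, the sketch below multiplies an array of points by a projection matrix. The `multiply` function is hypothetical; the actual function names and signatures in routes/matrix_utils.js may differ.

```javascript
// Hypothetical helper: multiply an n x d matrix by a d x m matrix.
function multiply(a, b) {
  if (a[0].length !== b.length) {
    throw new Error('dimension mismatch');
  }
  return a.map((row) =>
    b[0].map((_, j) => row.reduce((sum, v, i) => sum + v * b[i][j], 0))
  );
}

// Project 3D points onto the x-y plane by dropping the z coordinate.
const points3d = [[1, 2, 3], [4, 5, 6]];
const basis = [[1, 0], [0, 1], [0, 0]];
console.log(multiply(points3d, basis)); // [ [ 1, 2 ], [ 4, 5 ] ]
```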