Optimization plays a significant role in many areas, such as engineering, the sciences, and health care. My research is mainly motivated by “big data” problems arising in image processing, machine learning, statistics, finance, and related fields. These problems include data reconstruction, data reduction, and data mining.
Compressed sensing and low-rank matrix / tensor recovery
For large-scale data, full acquisition can be very expensive (e.g., MRI). To save acquisition time, the data is often only partially sampled, or only a few measurements of it are taken. In some applications (e.g., movie-user ratings at Netflix), acquiring the complete data is not even possible. However, when the data has special structure (e.g., sparsity, smoothness, or low rank), it can often be reliably reconstructed from incomplete observations or under-determined measurements.
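As a small illustration of recovering a sparse signal from under-determined measurements, the sketch below runs the iterative soft-thresholding algorithm (ISTA) on an l1-regularized least-squares problem. It is only one standard method, chosen here for brevity and not taken from this page. The problem sizes, the regularization weight `lam`, and all function names are assumptions made for the example, and it uses only the Python standard library so that it runs without extra packages.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def objective(A, b, x, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    res = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return 0.5 * sum(ri * ri for ri in res) + lam * sum(abs(xj) for xj in x)

def ista(A, b, lam, iters=500):
    """Proximal-gradient (ISTA) iterations starting from x = 0."""
    # The squared Frobenius norm upper-bounds the squared spectral norm,
    # so 1/L is a conservative but safe step size.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    for _ in range(iters):
        res = [ri - bi for ri, bi in zip(matvec(A, x), b)]
        grad = rmatvec(A, res)
        x = [soft_threshold(xj - gj / L, lam / L) for xj, gj in zip(x, grad)]
    return x

# Hypothetical test problem: 20 random measurements of a 50-dimensional
# signal that has only 3 nonzero entries.
random.seed(0)
m, n = 20, 50
A = [[random.gauss(0, 1) / m ** 0.5 for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
for j in (3, 17, 41):
    x_true[j] = 1.0
b = matvec(A, x_true)
x_hat = ista(A, b, lam=0.01)
```

The conservative step size makes each iteration decrease the objective, at the cost of slow convergence; practical solvers typically use a tighter Lipschitz estimate or acceleration.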
Regularized matrix and tensor factorization
Even if a large amount of data is completely acquired, storing all of it is often very expensive and may be wasteful because of redundancy in the data (e.g., in face recognition). Data reduction is usually necessary to remove this redundancy while retaining the principal information. Matrix and tensor factorizations with regularization terms (e.g., nonnegativity, sparsity, orthogonality) are effective tools for dimensionality reduction and feature extraction.
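To make the idea of a regularized factorization concrete, here is a minimal sketch of nonnegative matrix factorization using the classical multiplicative updates of Lee and Seung. It stands in for the general class of methods mentioned above and is not presented as the approach used in this research. The helper names, matrix sizes, iteration count, and the small constant `eps` guarding against division by zero are all assumptions made for the example.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(X, W, H):
    """Frobenius norm of X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5

def nmf(X, rank, iters=200, eps=1e-12, seed=0):
    """Approximate X by W H with W, H >= 0 using multiplicative updates."""
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    # Strictly positive initialization keeps every iterate nonnegative.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, num, den)]
    return W, H

# Hypothetical data: a 12 x 10 nonnegative matrix of exact rank 3.
rng = random.Random(1)
W0 = [[rng.random() for _ in range(3)] for _ in range(12)]
H0 = [[rng.random() for _ in range(10)] for _ in range(3)]
X = matmul(W0, H0)
W, H = nmf(X, rank=3)
```

Here the 120 entries of `X` are summarized by 12 x 3 + 3 x 10 = 66 nonnegative numbers, which is the kind of reduction the paragraph above describes.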
If the data arrives as a stream (e.g., in stochastic programming and online learning) and one wants to learn or extract important features from it, storing all the data and then performing data reduction or mining may be impossible. At any point in time, only part of the data can be accessed. Even if one can wait for all the data to arrive and store it, accessing all of it for each update of the variables can be extremely expensive, so sampling a small amount of the data at each update is still beneficial and more efficient.
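The streaming setting can be sketched with plain stochastic gradient descent on a least-squares model, where each sample is used for one update and then discarded. This is a generic illustration under assumed settings, not a description of a specific method from this research. The generator `sample_stream`, the constant step size, the noise level, and the ground-truth vector `w_true` are hypothetical choices for the example.

```python
import random

def sample_stream(w_true, n_samples, noise=0.01, seed=0):
    """Yield (x, y) pairs one at a time; nothing is stored."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in w_true]
        y = sum(wi * xi for wi, xi in zip(w_true, x)) + rng.gauss(0, noise)
        yield x, y

def sgd_stream(stream, dim, step=0.05):
    """One stochastic gradient step on 0.5 * (w.x - y)^2 per arriving sample."""
    w = [0.0] * dim
    for x, y in stream:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        # The gradient of the per-sample loss with respect to w is residual * x.
        w = [wi - step * residual * xi for wi, xi in zip(w, x)]
    return w

w_true = [1.0, -2.0, 0.5]  # assumed ground truth for the demonstration
w_hat = sgd_stream(sample_stream(w_true, n_samples=2000), dim=len(w_true))
```

Each update touches a single sample, so the memory cost is independent of the length of the stream, and the same loop applies to stored data that is too large to scan at every update.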