Dr. George Yu

Associate Professor

Computer Science, Information, & Engineering Technology

Meshel Hall 316

phone: (330) 941-1775

fyu@ysu.edu

fyu.people.ysu.edu

Bio

I am an Associate Professor of Computer Science and Information Systems at Youngstown State University in Youngstown, Ohio. I received my Ph.D. in Computer Science from Southern Illinois University, Carbondale, IL, in 2013. I also serve as the Campus Champion for the NSF Extreme Science and Engineering Discovery Environment (XSEDE) at YSU.

Research Interests

Big Data, Cloud Computing, Data Management, Data Science

Teaching Interests

Big Data, Cloud Computing, Data Management, Data Science

  • Education
    • 2013

      Ph.D., Computer Science

      Southern Illinois University

  • Intellectual Contributions
    • 2023

      "Systematic collection and analysis of alternative splicing events in potato plants"

      Journal of Plant Sciences, volume 11, issue 3, p. 98-106

    • 2022

      "Non-Parametric Error Estimation for σ-AQP using Optimized Bootstrap Sampling"

      International Journal for Computers & Their Applications, volume 29, issue 1

    • 2022

      "Identification and Analysis of Alternative Splicing in Soybean Plants"

      volume 83, p. 1-9

    • 2021

      "Optimized Bootstrap Sampling for σ-AQP Error Estimation: A Pilot Study"

      Proceedings of ISCA 30th International Confer, volume 77, p. 144-153

    • 2020

      "Scalable Correlated Sampling for Join Query Estimations on Big Data (journal edition)"

      International Journal of Computers and Their Applications, volume 27, issue 1, p. 14-23

    • 2019

      "CS*: Approximate Query Processing on Big Data using Scalable Join Correlated Sample Synopsis"

      2019 IEEE International Conference on Big Data (Big Data), p. 583-592

    • 2019

      "Scalable Correlated Sampling for Join Query Estimations on Big Data"

      volume 64, p. 41-50

    • 2019

      "Expanding alternative splicing identification by integrating multiple sources of transcription data in tomato"

      Frontiers in Plant Science, volume 10

    • 2018

      "OB-tree: a new write optimisation index on out-of-core column-store databases (journal edition)"

      International Journal of Intelligent Information and Database Systems, volume 11, issue 1, p. 46-66

    • 2018

      "Co-membership, networks ties, and OSS success: An investigation controlling for alternative mechanisms for knowledge flow"

    • 2018

      "An Evaluation of the Performance of Join Core and Join Indices Query Processing Methods"

      International Journal of Computers and Their Applications, volume 25, issue 3, p. 123-131

    • 2017

      "Storing Join Relationships for Fast Join Query Processing"

      p. 167-177

    • 2017

      "OB-Tree: Accelerating Data Cleaning in Out-of-Core Column-Store Databases"

      Proceedings - 2017 IEEE 6th International Congress on Big Data, BigData Congress 2017

    • 2017

      "Fast processing of join queries with instant response"

      Proceedings of Computing Conference 2017, volume 2018-Janua

    • 2017

      "Comparative landscape of alternative splicing in fruit plants"

      Current Plant Biology, volume 9, p. 29-36

    • 2017

      "An efficient data structure for fast join query processing"

      p. 483-492

    • 2016

      "ProtSecKB: The protist secretome and subcellular proteome knowledgebase"

      Computational Molecular Biology, volume 6

    • 2016

      "Data cleaning in out-of-core column-store databases: An index-based approach"

      p. 16

    • 2016

      "An Aggressive Concurrency Control Protocol For Main Memory Databases"

      International Journal of Computer Applications, volume 155, issue 2, p. 7

    • 2016

      "A Prudent-Precedence Concurrency Control Protocol for High Data Contention Main Memory Databases"

      p. 3

    • 2016

      "A Prudent-Precedence Concurrency Control Protocol for High Data Contention Database Environments"

      arXiv preprint arXiv:1611.05557

    • 2016

      "A hierarchical precedence concurrency control protocol for high data contention database environments"

      29th International Conference on Computer Applications in Industry and Engineering, CAINE 2016

    • 2015

      "Write Optimization Using Asynchronous Update on Out-of-Core Column-Store Databases in Map-Reduce"

      Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress 2015

    • 2015

      "Prediction of plant protein subcellular locations"

      Proceedings of the 7th International Conference on Bioinformatics and Computational Biology, BICOB 2015

    • 2015

      "Online Data Cleaning for Out-Of-Core Column-Store Databases with Timestamped Binary Association Tables"

      p. 407-412

    • 2015

      "Hastening data retrieval on out-of-core column-store databases using offset b+-tree"

      Proc. CAINE'15, p. 313-318

    • 2015

      "Genome-wide cataloging and analysis of alternatively spliced genes in cereal crops"

      BMC Genomics, volume 16, issue 1

    • 2015

      "An Asynchronous Method for Write Optimization of Column-Store Databases in Map-Reduce"

      International Journal of Big Data, volume 2, issue 4, p. 1-9

    • 2015

      "A framework of write optimization on read-optimized out-of-core column-store databases"

      p. 155-169

    • 2014

      "Sample Trace: Deriving Fast Approximation for Repetitive Queries"

      DBKDA 2014, p. 67

    • 2014

      "Histogram and sample revisited: Estimation error analysis for multidimensional range selection queries"

    • 2014

      "Estimation Error Analysis of Range Selection Queries using Histogram and Sample in Low Dimensional Spaces"

      International Journal of Computers and Their Applications, volume 21, issue 3, p. 151-159

    • 2014

      "Containment Join in the Presence of Nested Data Nodes"

      p. 221-226

    • 2014

      "Asynchronous Update on Out-of-Core Column-Store Databases Utilizing the Time stamped Binary Association Table"

      p. 215-220

    • 2014

      "A Design of Heterogeneous Cloud Infrastructure for Big Data and Cloud Computing Services"

      Open Journal of Mobile Computing and Cloud Computing, volume 1, issue 2, p. 1-16

    • 2013

      "Sufficient statistics for re-optimizing repetitive queries (journal edition)"

      International Journal of Computers and their Applications, volume 20, issue 1

    • 2013

      "CS2: a new database synopsis for query estimation"

      p. 469-480

    • 2013

      "Constructing Accurate Synopses for Database Query Optimization and Re-optimization"

      Southern Illinois University Carbondale

    • 2013

      "A Prudent-precedence Concurrency Control Protocol for High Data Contention Database Environments"

      p. 289-294

    • 2012

      "Using Cached Results to Expedite XML Query Evaluations"

      p. 81-85

    • 2012

      "Sufficient statistics for re-optimizing repetitive queries"

      Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012

    • 2012

      "Network intrusion detection types and computation"

      International Journal of Computer Science and Information Security, volume 10, issue 1, p. 14-21

    • 2012

      "A Framework for Re-optimizing Repetitive Queries"

      p. 72-78

    • 2011

      "Range-Sum Queries over High Dimensional Data Cubes Using a Dynamic Grid File"

      IGI Global

    • 2011

      "Join selectivity re-estimation for repetitive queries in databases"

      p. 420-427

    • 2011

      "IRTA: An improved threshold algorithm for reverse top-k queries"

      ICEIS 2011 - Proceedings of the 13th International Conference on Enterprise Information Systems, volume 1 DISI

    • 2010

      "Selectivity Re-estimation for Repetitive Queries in Databases"

      p. 12-18

    • 2010

      "Approximate clustering on data streams using discrete cosine transform"

      Journal of Information Processing Systems, volume 6, issue 1, p. 67-78

    • 2009

      "Enhancing SQL with Set-Comparison Operators"

      p. 51-56

    • 2009

      "A sampling approach for XML query selectivity estimation"

      p. 335-344

Bio

Dr. Yu received his Ph.D. in Computer Science from Southern Illinois University, Carbondale, IL, in May 2013. His Ph.D. thesis focused on query processing and optimization in database management systems. He received an M.S. degree in Mathematics from Shandong University, Jinan, China, in 2008 and a B.S. degree in Information and Computational Science from Northeastern University, Shenyang, China, in 2005.

Dr. Yu has published many papers in high-quality journals and conferences, including the International Journal of Big Data, BMC Genomics, IEEE Big Data, SIGMOD, and DEXA. One of his papers received the Best Paper Award at the International Conference on Software Engineering and Data Engineering (SEDE) in 2019. He serves as a reviewer and an associate editor for scholarly journals and as a committee member for international conferences. His research has been supported by external funding sources including Amazon Inc. and the Computing Research Association.

Dr. Yu serves as the Campus Champion for the NSF Extreme Science and Engineering Discovery Environment (XSEDE) at YSU. In this role he helps YSU researchers apply for computing grants in XSEDE and use its supercomputing resources. In addition, he has collaborated with XSEDE and the Pittsburgh Supercomputing Center (PSC) since 2014 to bring a national workshop series on High-Performance Computing (HPC) to YSU.

Research Lab

YSU Data Lab is a research lab focused on data-oriented sciences. Dr. Yu and his students work on multiple research projects in this lab. The lab also operates multiple high-performance research clouds, including Sarah Cloud and YSU STEM Cloud.

Research Interests

  • Database Management Systems
    • Query Processing and Query Optimization
    • Semi-Structured Databases
  • Big Data Management and Analytics
    • Approximate Query Processing
    • NoSQL Databases
  • Data Science
    • Bioinformatics
    • Analytics of Social and Business Data
  • Cloud Computing
    • Cloud-Based Data Management
    • Hybrid Cloud Infrastructure


Personal Website 
Google Scholar 
Data Lab Website