UCSD Computational Research Experience for Undergraduates (CREU) at the San Diego Supercomputer Center


This year, the San Diego Supercomputer Center, at the University of California, San Diego, launches a unique volunteer internship program for undergraduate students entitled Computational Research Experience for Undergraduates (CREU). As part of this program, students will be paired with an SDSC mentor and will work as part of a research team dedicated to a particular area of computational research.

Please review the 18 research opportunities listed below and submit a separate application for each opportunity that may interest you. All applications must be submitted via US Mail and postmarked no later than November 27, 2010.

Please note: These research opportunities are open to any undergraduate student at any institution of higher learning. Opportunities will be posted with local colleges and universities that offer independent research credit. Where UCSD has not made arrangements, it is the responsibility of the student applicant to arrange credit with their college or university.

"This is an opportunity to potentially be published, to potentially be hired and to discover a career path that you may not have known existed." - Robert Thompson, former SDSC intern and Sony Game Developer. 

San Diego Supercomputer Center, UC San Diego

January 10 – March 18, 2011 (ten weeks)

Internship hours will be coordinated with your mentor and can range from 10 - 15 hours per week.

Undergraduate Winter Research Opportunities:

Project Title: CloudStor: Data Intensive Computing on the Cloud

Project Overview

The objective of the CloudStor group is to explore new strategies and technologies for data-intensive cloud computing, investigate application profiles that benefit from this paradigm, and develop corresponding applications. The CloudStor group is interested in evaluating the performance and price/performance of alternative, dynamic strategies for provisioning data-intensive applications based on parallel database systems versus Hadoop. Our studies will contribute to the understanding of performance tradeoffs and feasibility in various provisioning strategies for serving large scientific data sets. A possible outcome is a reassessment of how data archives are implemented and how data sets are served to a broad user community using on-demand and dynamic approaches for provisioning data sets, as opposed to the current static approach.

Project Mentor: Dr. Chaitanya Baru, San Diego Supercomputer Center, UCSD

 

Project Title: Renewable Energy / Microgrid System Analysis Project

Project Overview

UCSD is an owner-operator of a 45 MW peak load microgrid with multiple renewable and non-renewable energy generation resources, significant energy storage, and a sophisticated monitoring system for controlling flex-demand loads. UCSD seeks to elevate its Smart Grid to a level of global excellence as a holistic, planned community that balances energy infrastructure with biodiversity, transportation, water efficiency, waste stream management, greenhouse gas (GHG) reduction, telecommunications, security and disaster preparedness. The fundamental goals are to advance the understanding of the complex dynamics that drive community-scale, end-use energy demand and the associated local and global air emissions; to apply this knowledge to generate planning methods, community design models and municipal processes that enable practitioners to build energy-efficient, low-carbon development projects; and to resolve market barriers and risks impeding the integration of energy-efficient technologies into development projects through energy-development industry collaboration.

Project Mentor: Dr. Natasha Balac, San Diego Supercomputer Center and Dr. Byron Washom, Director, Strategic Energy Initiatives, UCSD

 

Project Title: Provocateur

Project Overview

This project researches the use of shared databases on the web for a public recommendation and brainstorming system. Students will conduct initial research into public linked data sets available on the web and learn how to form and specify queries in SPARQL and other semantic web languages to retrieve documents relevant to a group discussion.

Project Mentor: Dr. Natasha Balac, San Diego Supercomputer Center and Dr. Shlomo Dubnov, Assistant Professor, Music, UCSD

 

Project Title: Narrative and Emotional Structure Discovery through Machine Learning Applied to a Large Film and Television Script Database

Project Overview

Very large humanities data sets and high performance computing create unique opportunities for multimedia data analytics. Film scripts are a semi-formalized way of representing a story. Recently, the importance of story in organizational communication and learning has been widely celebrated, and many methods are used to introduce more sophisticated structures and additional information into traditional stories. Access to temporal information raises the need to model and associate complex information over time in a manner that is intuitive, easy to visualize, and at the same time precise enough to capture structures that are meaningful for a specific domain, genre, style, or expertise. Moreover, content originating in one medium (cinema, television, games, online), such as film and drama series, is increasingly migrated to another, with audio and video clips being reused, remixed and recombined to create alternative meanings. Analysis of narrative in text and media requires establishing a web of inter-relations between multivariate, mixed numeric and symbolic data whose evolution needs to be traced over time. A promising direction in this respect is modeling semantics via the geometry and topology of information. The starting point is embedding media objects, such as words in a scene of a film script or features in a segment of a video or audio stream, in a metric space.
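As a rough illustration of what embedding script scenes in a metric space could look like, the sketch below turns each scene into a normalized word-frequency vector and measures the Euclidean distance between consecutive scenes. The bag-of-words representation and the function names are assumptions chosen for illustration; the project description does not specify a particular embedding.

```python
import math
from collections import Counter

def embed(scene_text, vocabulary):
    """Map one scene to a vector of relative word frequencies (illustrative choice)."""
    counts = Counter(scene_text.lower().split())
    total = sum(counts.values()) or 1  # avoid division by zero for empty scenes
    return [counts[word] / total for word in vocabulary]

def distance(u, v):
    """Euclidean distance, which satisfies the axioms of a metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def scene_distances(scenes):
    """Distance between each scene and the next, tracing change over time."""
    vocabulary = sorted({word for scene in scenes for word in scene.lower().split()})
    vectors = [embed(scene, vocabulary) for scene in scenes]
    return [distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]

if __name__ == "__main__":
    scenes = ["rain rain night", "rain rain night", "sun beach"]
    print(scene_distances(scenes))
```

A large jump between consecutive scenes may hint at a shift in setting or mood, which is the kind of structure a time-based analysis would then examine more carefully.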

Project Mentor: Dr. Natasha Balac, San Diego Supercomputer Center and Dr. Shlomo Dubnov, Assistant Professor, Music, UCSD

 

Project Title: NEURON Optimization and Validation Study

Project Overview

NEURON is the gold standard in simulation environments for modeling individual neurons and networks of neurons. It provides tools for conveniently building, managing, and using models in a way that is numerically sound and computationally efficient. We plan to study the rules and strategies for converting serial models to naive parallel models and for optimizing the parallel models. We also plan to study model validation. The student's work may include working with various neuroscience models, helping with the runs, analyzing the output, or devising validation techniques. The student participant will be mentioned in any publication based on the outcome of the research. We encourage students working in the field of computer science or computational biology to apply.

Project Mentor: Subhashini Sivagnanam and Kenneth Yoshimoto, San Diego Supercomputer Center, UCSD

 

Project Title: Evaluating Data Mining Tools for Large Memory, High Performance Computing

Project Overview

Increasingly, scientific breakthroughs will depend on advanced computing capabilities that help researchers manipulate and explore massive datasets. The trouble is that science and society produce so much data (internet documents, brain images, climate measurements, etc.), which often needs to be analyzed as one big interdependent mass instead of small local pieces. SDSC is currently deploying special large, flash-storage, parallel computer architectures that can handle such data-intensive analysis. The research activity will involve working with the PIs to find a large scientific data set that exists in the public domain and to understand the basic kinds of questions that are addressed with such data. The student will then apply data mining algorithms to the scientific data. This application will address basic issues such as how much data to use, what kind of preprocessing is necessary in an HPC environment (over and above a non-HPC one), and what kind of performance results can be delivered over some range of HPC parameters and algorithm parameters.

Project Mentor: Dr. Paul Rodriguez and Dr. Nicole Wolter, San Diego Supercomputer Center, UCSD

 

Project Title: User Behavior Study for Schedulers

Project Overview

Understanding the behavior of the user base will allow the design of better job schedulers. Previous studies have shown that user behavior correlates best with the response time of users' jobs. However, that conclusion was based on examining traces containing raw data on jobs that were submitted to real, production-use parallel systems. We plan to explore a different angle by using data mining tools to understand the patterns behind user behavior. The student's work may include gathering data, working with data mining tools to analyze data, and helping with the simulations. The student participant will also be mentioned in any publication based on the outcome of the research. We encourage students working in the field of computer science or engineering to apply.

Project Mentor: Subhashini Sivagnanam and Kenneth Yoshimoto, San Diego Supercomputer Center, UCSD

 

Project Title: New QM/MM Approaches for High Performance Molecular Dynamics Simulations of Condensed Phase Biological Systems

Project Overview

This CREU opportunity is set up to be an integral part of our ongoing effort to develop an extensible interface for mixed quantum mechanical/molecular mechanical (QM/MM) molecular dynamics (MD) simulations that combine the AMBER (http://www.ambermd.org) MD software engine with quantum chemistry programs like ADF (http://www.scm.com), GAMESS (http://www.msg.chem.iastate.edu/gamess/) and TeraChem (http://www.petachem.com). The availability of such an interface improves the accuracy and range of applicability of MD simulations that are available to researchers that are using the AMBER MD software package. The CREU intern will contribute at various stages during the project including software development, testing of the implementation and validation calculations on local compute resources including GPU clusters, the Triton compute resource (http://tritonresource.sdsc.edu/) and a variety of TeraGrid (http://www.teragrid.org) supercomputing platforms.

Project Mentor: Professor Ross C. Walker and Dr. Andreas W. Goetz, San Diego Supercomputer Center, UCSD

 

Project Title: OpenTopography: A Portal to High Resolution Topography Data & Tools

Project Overview

We seek CREU students to work as part of the National Science Foundation-funded OpenTopography Facility (http://www.opentopography.org) hosted at the San Diego Supercomputer Center (SDSC). OpenTopography has developed a cyberinfrastructure-based solution to enable online access to Earth science-oriented high-resolution LIDAR topography data, online processing tools, and derivative products. High-resolution topography data acquired with LIDAR (Light Detection and Ranging) remote sensing technology have emerged as a fundamental tool for Earth science research. Because these acquisitions are often undertaken with federal and state funds at significant cost, it is important to maximize the impact of the data by providing online access to a range of potential users. Leveraging high performance computational and data storage resources available at SDSC, OpenTopography provides access to terabytes of point cloud data, standard digital elevation models, and Google Earth image data, all co-located with computational resources for higher-level data processing.

Project Mentor: Dr. Chaitanya Baru and Dr. Christopher Crosby, San Diego Supercomputer Center, UCSD

 

Project Title: SCAlable National Network for Effectiveness Research (SCANNER)

Project Overview

The SCANNER project will develop a distributed network infrastructure for comparative effectiveness research that provides participating sites the means for flexibility in data sharing. This flexibility will be implemented by allowing codification of data sharing policies, where each institution will specify its own policies. SCANNER will connect diverse healthcare delivery settings with secure infrastructure that utilizes data collected at the point of care. Policies for data sharing will range from sharing of de-identified records to sharing aggregate results. The network will have a main node that manages policies, distributes queries, aggregates results, and maintains trust and security (authentication, authorization, auditing, etc). Each site will maintain a node that contains data from that site. The network will support retrospective analyses, prospective observational studies, clinical trials, and feedback to point-of-care users. Near real-time collection, analysis, and dissemination of results and feedback to the clinician will be enabled by an infrastructure that allows data to be exchanged according to policies specified by individuals and institutions.
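The division of labor described above (site nodes that enforce their own sharing policy, and a main node that distributes a query and aggregates the replies) can be sketched in a few lines. This is only a toy model under stated assumptions: the class `SiteNode`, the function `main_node_query`, and the policy labels `"deidentified"` and `"aggregate"` are hypothetical names invented for this example and are not part of SCANNER.

```python
from dataclasses import dataclass

@dataclass
class SiteNode:
    """Hypothetical site node holding local records and a local sharing policy."""
    name: str
    policy: str    # "deidentified" or "aggregate" (illustrative labels)
    records: list  # dicts with "patient_id" and "value" keys

    def answer(self, predicate):
        matches = [r for r in self.records if predicate(r)]
        if self.policy == "deidentified":
            # Share row-level data with the identifier stripped.
            return {"rows": [{"value": r["value"]} for r in matches]}
        # Otherwise share only aggregate results.
        return {"count": len(matches), "total": sum(r["value"] for r in matches)}

def main_node_query(sites, predicate):
    """Hypothetical main node: send the query to every site and merge the replies."""
    count, total = 0, 0.0
    for site in sites:
        reply = site.answer(predicate)
        if "rows" in reply:
            count += len(reply["rows"])
            total += sum(row["value"] for row in reply["rows"])
        else:
            count += reply["count"]
            total += reply["total"]
    return {"count": count, "mean": total / count if count else None}

if __name__ == "__main__":
    sites = [
        SiteNode("A", "deidentified", [{"patient_id": 1, "value": 2.0},
                                       {"patient_id": 2, "value": 4.0}]),
        SiteNode("B", "aggregate", [{"patient_id": 3, "value": 6.0}]),
    ]
    print(main_node_query(sites, lambda r: r["value"] > 1))
```

In this sketch the main node never sees a `patient_id`, whichever policy a site chooses; authentication, authorization and auditing are left out entirely.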

Project Mentor: Dr. Natasha Balac and Dr. Michelle Day, San Diego Supercomputer Center, UCSD

 

Project Title: 3D Modeling and Game Development

Project Overview

This project seeks to recreate the segregated society and cultural activities at two Japanese American internment camps during World War II in an interactive gaming environment. The interdisciplinary research will involve students from Computer Science, Communication, and Theatre and Dance. The study will model the Jerome and Rohwer campsites in southeast Arkansas. It will consist of game level and game play design using the Torque game engine as well as 3D modeling using Maya software. Students will learn and apply game development and 3D modeling in this project.

Project Mentor: Dr. Amit Chourasia, Lead, VisServices, San Diego Supercomputer Center, and Dr. Emily Roxworthy, Department of Theater and Dance, UCSD

 

Project Title: iDASH: integrating Data for analysis, Anonymization, and Sharing

Project Overview

The iDASH project will create a national center for biomedical computing that develops new algorithms, open-source tools, and computational infrastructure and services that enable biomedical and behavioral researchers to integrate Data for analysis, Anonymization, and Sharing. A multitude of research questions exist that cannot be addressed adequately by viewing data from a single healthcare institution, and concerns about privacy when sharing data add a significant barrier to research progress. iDASH will address this fundamental challenge by providing a secure, privacy‐preserving environment in which researchers can analyze genomic, transcriptomic, and highly annotated phenotypic data. iDASH will focus on privacy protection through anonymization, data simulation, and an informed consent management system. It will concentrate on data analysis through the development of new tools for data annotation and integration across temporal and spatial dimensions and develop algorithms for rare event detection and risk adjustment. This project is a starting point for the development of new tools that will advance three biologically-based projects which span the molecular-individual-population spectrum: (1) molecular phenotyping of Kawasaki Disease, (2) surveillance of anticoagulation agents, and (3) individualized intervention to enhance physical activity.

Project Mentor: Dr. Michelle Day, San Diego Supercomputer Center, UCSD

 

Project Title: Processing Large Images on Triton

Project Overview

This project is creating a mechanism and software for processing large biomedical images on a new SDSC supercomputer. Efficient processing of large imagery, including 3D reconstruction, tile stitching, deconvolution, mosaicing, spatial registration and warping, and online visualization, is compute-intensive and should rely on advanced high performance computing systems. Yet the demand for such imagery available online has become ubiquitous. This project will define workflows for efficient image processing for three types of images: those derived from confocal laser scanning light microscopes (LMs), from intermediate voltage electron microscopes (IVEMs), and from serial block-face scanning electron microscopes (SEMs). The main challenges of managing such imagery include the extremely large sizes of images, image collections and volumes (several terabytes for individual images or mosaics), the absence of spatial registration and alignment, compute-intensive procedures for image-volume reconstruction, stitching and warping, and the management of image metadata. In this project, we will experiment with different techniques to address these challenges, and explore memory configurations suitable for image processing.

Project Mentor: Dr. Ilya Zaslavsky and Dr. David Valentine, San Diego Supercomputer Center, UCSD

 

Project Title: Quality Assurance for an Open Source Hydrologic Information System Desktop Client

Project Overview

The CUAHSI Hydrologic Information System project is developing a desktop client application called HydroDesktop. The client has been developed by graduate students from fields other than computer science. This project involves refactoring the HydroDesktop application and a number of plugins, which are written using Microsoft .NET technologies. The goal is to improve the stability of this open source application, ensure that it can be tested, and provide a number of functional improvements. This will require developing a software quality policy and demonstrating implementations using unit tests in the NUnit framework. The end goal is to provide stable releases on Windows and Mono.

Project Mentor: Dr. Ilya Zaslavsky and Dr. David Valentine, San Diego Supercomputer Center, UCSD

 

Project Title: Concept Visualization and Editing in Semantic Wiki

Project Overview

This is a component of a large project that creates cyberinfrastructure for hydrologic sciences. In order to search hundreds of distributed hydrologic data sources, one has to annotate them and associate names of measured variables (there are tens of thousands of such variables) with a community-adopted list of keywords. These keywords (several thousand of them) form a hierarchy. Hydrology experts are planning to use a new technology called “semantic wiki” to develop and maintain a hierarchy of the common terms. In this project, we will integrate semantic wiki software with a hyperbolic tree visualization tool. At the moment, we can use a Startree visualization tool to help users navigate the semantic wiki. However, we do not yet update the visualization of concepts when changes are made in the semantic wiki. The programmer in this project will work with us on developing semantic wiki and Startree code to make community management and navigation of variable semantics easier.

Project Mentor: Dr. Ilya Zaslavsky and Dr. David Valentine, San Diego Supercomputer Center, UCSD

 

Project Title: HTML5 Dashboard for Kepler Scientific Workflow Design and Execution

Project Overview

A scientific workflow is the process of combining data and processes into a configurable, structured set of steps that implement semi-automated computational solutions to a scientific problem. The Kepler scientific workflow system (see http://kepler-project.org/) is developed by a cross-project collaboration to serve scientists from different disciplines, and is currently coordinated by the Kepler/CORE project. Since 2003, over twenty diverse projects encompassing multiple disciplines have used Kepler to manage, process and analyze an increasing amount of scientific data. Inherited from Ptolemy II (see http://ptolemy.eecs.berkeley.edu/ptolemyII/), Kepler adopts the actor-oriented modeling paradigm for scientific workflow design and execution. Students working on Kepler will gain a unique research experience that includes practical training in real-world tools and best practices (e.g., CVS repository, ant, bugzilla, code reviews) as well as the latest developments in cyberinfrastructure research (e.g., web service composition, distributed execution, and provenance tracking).
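To give a feel for the actor-oriented idea, the sketch below models a workflow as a chain of actors whose firing order is controlled by a separate director, and records a simple provenance trail along the way. It is a minimal toy in plain Python: the `Actor` and `SequentialDirector` classes are hypothetical names for this example and do not reflect the actual Kepler or Ptolemy II APIs.

```python
class Actor:
    """Hypothetical actor: consumes one input token and produces one output token."""
    def __init__(self, name, func):
        self.name = name
        self.func = func

    def fire(self, token):
        return self.func(token)

class SequentialDirector:
    """Hypothetical director: decides when actors fire (here, strictly in order)."""
    def __init__(self, actors):
        self.actors = list(actors)

    def run(self, token):
        provenance = []  # which actor produced which intermediate token
        for actor in self.actors:
            token = actor.fire(token)
            provenance.append((actor.name, token))
        return token, provenance

def build_workflow():
    """Three-step example workflow: parse numbers, drop negatives, average."""
    return SequentialDirector([
        Actor("parse", lambda text: [float(x) for x in text.split(",")]),
        Actor("filter", lambda values: [v for v in values if v >= 0]),
        Actor("mean", lambda values: sum(values) / len(values)),
    ])

if __name__ == "__main__":
    result, provenance = build_workflow().run("1.0,-2.0,3.0")
    print(result)
    for name, token in provenance:
        print(name, token)
```

Keeping the execution policy in the director rather than in the actors means the same actors could, in principle, be rerun under a different scheduling strategy without modification.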

Project Mentor: Dr. Ilkay Altintas, San Diego Supercomputer Center, UCSD

 

Project Title: Development of Scripts for Configuring Hadoop Environment in Computer Clusters

Project Overview

MapReduce is a parallel and scalable programming model for data-intensive computing, in which input data is automatically partitioned across multiple nodes and user programs are distributed and executed in parallel on the partitioned data blocks. It consists of two functions: the Map function processes a portion of the whole data set and produces a set of intermediate key-value pairs, and the Reduce function accepts and merges the intermediate pairs generated by Map. MapReduce supports data partitioning, scheduling, load balancing, and fault tolerance. Following the simple interface of MapReduce, programmers can easily implement parallel applications. The Hadoop system provides an open source implementation of MapReduce and has been widely used in many scientific and business systems. Hadoop is composed of a MapReduce runtime system and a distributed file system called HDFS. HDFS supports MapReduce execution with automatic data redundancy and distribution across the nodes in the Hadoop cluster. Hadoop also handles node failures automatically. One Hadoop node, called the master, dispatches tasks and manages their execution on the other Hadoop nodes, called slaves. Hadoop can be deployed in a computer cluster to manage many CPU processors. However, creating a Hadoop environment in a computer cluster, such as the Triton resource at SDSC, is usually done manually. It involves several steps, including interacting with the cluster scheduler to obtain the needed resources, changing the Hadoop configuration based on the available resource information, and copying Hadoop onto each resource.
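The two-function model described above can be illustrated with the classic word-count example. The sketch below runs in a single Python process and does not use Hadoop; the function names (`map_words`, `reduce_counts`, `run_mapreduce`) are made up for this illustration, and the grouping step stands in for the shuffle that a real runtime performs across nodes.

```python
from collections import defaultdict

def map_words(block):
    """Map: emit an intermediate (word, 1) pair for every word in one data block."""
    return [(word.lower(), 1) for word in block.split()]

def reduce_counts(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return word, sum(counts)

def run_mapreduce(blocks):
    """Apply Map to each block, group pairs by key, then apply Reduce per key."""
    grouped = defaultdict(list)
    for block in blocks:  # in a real cluster, each block is processed on its own node
        for key, value in map_words(block):
            grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())

if __name__ == "__main__":
    blocks = ["the quick brown fox", "the lazy dog", "The fox"]
    print(run_mapreduce(blocks))
```

Because each call to `map_words` depends only on its own block, the Map phase can be spread across nodes without coordination, which is what makes the model straightforward to parallelize.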

Project Mentor: Dr. Ilkay Altintas, San Diego Supercomputer Center, UCSD

 

Project Title: HPWREN – High Performance Wireless Research Education Network – Smartphone Sensor Research Project 

Project Overview

HPWREN is interested in involving one or more undergraduate students in applications development for smartphones (essentially "wearable computers"). Of most interest is the open Android platform (e.g., a Motorola Droid). Example applications include graphical displays of sensor data, such as data from deployed meteorological stations. To see what HPWREN has in mind, please look at the met station on Mount Woodson. A web interface can be found at http://hpwren.ucsd.edu/Sensors/MtWoodson-WXT520/. An example of a graphically appealing real-time data application on an Android phone is "tricorder." HPWREN is researching whether something similar could be used to display sensor data, and perhaps even issue phone-local alerts if data goes out of bounds (e.g., a high wind alarm to a firefighter).

Project Mentor: Hans-Werner Braun, San Diego Supercomputer Center, UCSD 

 

The Application Process – How to Apply

Once you have reviewed the projects listed above and taken care to consider all of the prerequisites, then it is time to complete the application form. Remember, if you are applying for more than one internship, you will need to submit a separate application for each one. Multiple applications may be submitted in the same envelope.

CLICK HERE to download the Computational Research Experience for Undergraduates (CREU) Application Form.    

Selected applicants will be contacted by SDSC personnel to arrange a personal interview no later than December 10, 2010.

Questions?

If you have any questions about the application process, please contact Ange Mason, SDSC Education and Outreach, via phone at (858) 534-5064 or email at amason@ucsd.edu.

 
