Education

  • Ph.D. 2016

    Ph.D. in Computer Science

    University of California, Davis

  • M.S. 2010

    Artificial Intelligence

    Sharif University of Technology

  • B.S. 2007

    Software Engineering

    Sharif University of Technology

Extracurricular Activities

  • 2013-Present

    Co-founder and CFO of Razi Foundation

    I am proud to be a part of Razi Foundation, a charity created to help feed orphans and underprivileged people in African countries, mainly Ghana.

  • 2005-06

    Member of the “Students Scientific Chapter”

    Elected as a member of the “Students Scientific Chapter” at the Computer Engineering Dept., Sharif University of Technology

  • 2004-07

    ACM ICPC Staff

    I was a staff member at the ACM ICPC West Asia Regional Contest, held at Sharif University of Technology.

Honors, Awards and Grants

  • MSR 2017
    ACM SIGSOFT Distinguished Paper Award
    My paper, "Some From Here, Some From There: Cross-Project Code Reuse in GitHub", won the ACM SIGSOFT Distinguished Paper Award at the 14th International Conference on Mining Software Repositories.
  • 2015
    GGCS Graduate Program Fellowship
    Awarded the Graduate Group in Computer Science (GGCS) Graduate Program Fellowship, at UC Davis, 2015.
  • ICSM 2013
    Nominee: ACM SIGSOFT Distinguished Paper Award
    My paper, "Social Activities Rival Patch Submission for Prediction of Developer Initiation in OSS Projects", was nominated for the ACM SIGSOFT Distinguished Paper Award at the 29th IEEE International Conference on Software Maintenance.
  • 2013
    GGCS Graduate Program Fellowship
    Awarded the Graduate Group in Computer Science (GGCS) Graduate Program Fellowship, at UC Davis, 2013.
  • 2007
    2nd Place in the "Nationwide Graduate School Entrance Exam"
    I ranked 2nd among more than 10,000 participants in the "Nationwide Graduate School Entrance Exam" in the field of Artificial Intelligence. I also ranked 8th in Software Engineering and 14th in Computer Architecture.
  • 2003
    Top 1% in the "Nationwide Undergraduate University Entrance Exam"
    I ranked in the top 0.2% among more than 500,000 participants in the "Nationwide Undergraduate University Entrance Exam".

Research Colleagues

Prof. Hamid R. Rabiee

Colleague

Homepage

Prof. Hossein Asadi

Colleague

Homepage

Prof. Abbas Heydarnoori

Colleague

Homepage

Prof. Bogdan Vasilescu

Colleague

Homepage

Dale Fletter

Colleague

Homepage

Prof. Vladimir Filkov

Advisor

Homepage

Prof. Raissa D'Souza

Co-advisor

Homepage

Prof. Premkumar Devanbu

Colleague

Homepage

Dr. Daryl Posnett

Colleague

Homepage

Great lab mates!

I have been blessed with a great number of people who have helped me through my journey, both at UC Davis (more specifically at DECAL) and at Sharif University of Technology. Professors, post-docs, and students, they are all highly motivated and skilled and do top-of-the-line research. Check them out!


Some from here, some from there: cross-project code reuse in GitHub

Mohammad Gharehyazie, Baishakhi Ray, Vladimir Filkov
Conference Paper The 14th International Conference on Mining Software Repositories, 20-29 May, 2017, Buenos Aires, Pages 291-301

Abstract

Code reuse has well-known benefits on code quality, coding efficiency, and maintenance. Open Source Software (OSS) programmers gladly share their own code and they happily reuse others'. Social programming platforms like GitHub have normalized code foraging via their common platforms, enabling code search and reuse across different projects. Removing project borders may facilitate more efficient code foraging, and consequently faster programming. But looking for code across projects takes longer and, once found, may be more challenging to tailor to one's needs. Learning how much code reuse goes on across projects, and identifying emerging patterns in past cross-project search behavior may help future foraging efforts.
To understand cross-project code reuse, here we present an in-depth study of cloning in GitHub. Using Deckard, a clone finding tool, we identified copies of code fragments across projects, and investigated their prevalence and characteristics using statistical and network science approaches, and with multiple case studies. By triangulating findings from different methods, we find that cross-project cloning is prevalent in GitHub, ranging from cloning a few lines of code to whole project repositories. Some of the projects serve as popular sources of clones, and others seem to contain more clones than their fair share. Moreover, we find that ecosystem cloning follows an onion model: most clones come from the same project, then from projects in the same application domain, and finally from projects in different domains. Our results show directions for new tools that can facilitate code foraging and sharing within GitHub.

Tracing distributed collaborative development in Apache Software Foundation projects

Mohammad Gharehyazie, Vladimir Filkov
Journal Paper Empirical Software Engineering, November 2016, Pages 1-36

Abstract

Developing and maintaining large software systems typically requires that developers collaborate on many tasks. During such collaborations, when multiple people work on the same chunk of code at the same time, they communicate with each other and employ safeguards in various ways. Recent studies have considered group co-development in OSS projects and found that it is an essential part of many projects. However, those studies were limited to groups of size two, i.e., pairs of developers. Here we go further and characterize co-development in larger groups. We develop an effective methodology for capturing distributed collaboration beyond groups of size two, based on synchronized commit activities among multiple developers, and apply it to data from 26 OSS projects from the Apache Software Foundation. We find that distributed collaboration is prevalent, but not as frequent as expected. We also find that while in distributed collaborative groups, developers’ behavior is different than when programming alone, e.g., high developer focus on specific code packages associates with lower team participation, while packages with higher ownership get less attention from groups than from individuals. Finally, we show that productivity effort during co-development is more often lower for developers while they co-develop in groups. To verify our results we use both quantitative and qualitative methods, including a developer survey. We conclude that these methods and results can be used to understand the effects of the collaborative dynamic in OSS teams on the software engineering process. Our code, along with our datasets and survey, is available at http://www.gharehyazie.com/supplementary/teamwork/.

PlantPReS: A database for plant proteome response to stress

Seyed Ahmad Mousavi, Farhad Movahedi Pouya, Mohammad Reza Ghaffari, Mehdi Mirzaei, Akram Ghaffari, Mehdi Alikhani, Mohammad Gharehyazie, Setsuko Komatsu, Paul A Haynes, Ghasem Hosseini Salekdeh
Journal Paper Journal of Proteomics, June 2016, Pages 69-72

Abstract

About 75% of plant yield potential has been estimated to be lost to environmental stresses, even in developed agricultures. To facilitate the biotechnological improvement of crop productivity, genes and proteins that control crop adaptation to a wide range of environments will need to be identified. Due to the challenges faced in text/data mining, there is a large gap between the data available to researchers and the hundreds of published plant stress proteomics articles. Plant stress proteome database (PlantPReS; www.proteome.ir) is an open online proteomic database, which currently (as of October 2015) comprises > 20,413 entries from 456 manually curated articles, and contains > 10,600 unique stress responsive proteins. Since every aspect of the experiments, including protein name, accession number, plant type, tissue, stress types, organelles, and developmental stage has been digitized, experimental data can be rapidly accessed and integrated. Furthermore, PlantPReS enables researchers to perform multiple analyses on the database using the filtration mode, and the results of each query indicate a series of proteins for which a set of selected criteria is met. The query results can be displayed in either text or graphical format.

Expertise and Behavior of Unix Command Line Users: an Exploratory Study

Mohammad Gharehyazie, Bo Zhou, Iulian Neamtiu
Conference Paper ACHI 2016 : The Ninth International Conference on Advances in Computer-Human Interactions, 24 Apr., 2016, Venice, Pages 315-322

Abstract

Understanding users' behavioral patterns and quantifying users' expertise have myriad applications, from predicting user actions and tailoring the environment to that specific user, to detecting masquerade attacks and assessing learning outcomes. Toward this end, we have conducted a study on three Unix command datasets, totaling 263 users and more than 1 million commands. We first introduce the notions of command expertise, command line expertise, and command category. Next, we use these metrics, combined with other attributes, to define and quantify several user expertise metrics, e.g., category breadth, command line expertise. Our study has revealed many Unix command characteristics, e.g., Unix commands can be grouped into 25 categories; file management is the most common activity; the most commonly used commands are two characters long. Our study has also revealed many insights into user expertise and behavior, such as: command line length is not an indicator of user expertise; user activity is highest on Monday and decreases every day through Saturday, picking up on Sunday; peak command usage hours are 11 a.m., 1 p.m. and 4 p.m.; development activities happen mostly in the afternoon.

Developer Initiation and Social Interactions in OSS: A Case Study of the Apache Software Foundation

Mohammad Gharehyazie, Daryl Posnett, Bogdan Vasilescu, Vladimir Filkov
Journal Paper Empirical Software Engineering, August 2014, Pages 1-36

Abstract

Maintaining a productive and collaborative team of developers is essential to Open Source Software (OSS) success, and hinges upon the trust inherent among the team. Whether a project participant is initiated as a committer is a function of both his technical contributions and also his social interactions with other project participants. One’s online social footprint is arguably easier to ascertain and gather than one’s technical contributions, e.g., gathering patch submission information requires mining multiple sources with different formats, and then merging the aliases from these sources. In contrast to prior work, where patch submission was found to be an essential ingredient to achieving committer status, here we investigate the extent to which the likelihood of achieving that status can be modeled solely as a social network phenomenon. For 6 different Apache Software Foundation OSS projects we compile and integrate a set of social measures of the communications network among OSS project participants and a set of technical measures, i.e., OSS developers’ patch submission activities. We use these sets to predict whether a project participant will become a committer, and to characterize their socialization patterns around the time of becoming committer. We find that the social network metrics, in particular the amount of two-way communication a person participates in, are more significant predictors of one’s likelihood of becoming a committer. Further, we find that this is true to the extent that other predictors, e.g., patch submission info, need not be included in the models. In addition, we show that future committers are easy to identify with great fidelity when using the first three months of data of their social activities. Moreover, the first month of their social links alone is a very useful predictor, coming within 10% of the three-month data’s predictions.
Interestingly, we find that on average, for each project, one’s level of socialization ramps up before the time of becoming a committer. After obtaining committer status, their social behavior is more individualized, falling into a few distinct modes of behavior. In a significant number of projects, immediately after the initiation there is a notable social cooling-off period. Finally, we find that it is easier to become a developer earlier in the project’s life cycle than it is later as the project matures. These results should provide insight on the social nature of gaining trust and advancing in status in distributed projects.

Social Activities Rival Patch Submission for Prediction of Developer Initiation in OSS Projects

Mohammad Gharehyazie, Daryl Posnett, Vladimir Filkov
Conference Paper Software Maintenance (ICSM), 2013 29th IEEE International Conference on, 22-28 Sept. 2013, Eindhoven, Pages 340-349

Abstract

Maintaining a productive and collaborative team of developers is essential to Open Source Software (OSS) success, and hinges upon the trust inherent among the team. Whether a project participant is initiated as a developer is a function of both his technical contributions and also his social interactions with other project participants. One’s online social footprint is arguably easier to ascertain and gather than one’s technical contributions, e.g., gathering patch submission information requires mining multiple sources with different formats, and then merging the aliases from these sources. In contrast to prior work, where patch submission was found to be an essential ingredient to achieving developer status, here we investigate the extent to which the likelihood of achieving that status can be modeled solely as a social network phenomenon. For 6 different OSS projects we compile and integrate a set of social measures of the communications network among OSS project participants and a set of technical measures, i.e., OSS developers’ patch submission activities. We use these sets to predict whether a project participant will become a developer. We find that the social network metrics, in particular the amount of two-way communication a person participates in, are more significant predictors of one’s likelihood of becoming a developer. Further, we find that this is true to the extent that other predictors, e.g., patch submission info, need not be included in the models. In addition, we show that future developers are easy to identify with great fidelity when using the first three months of data of their social activities. Moreover, the first month of their social links alone is a very useful predictor, coming within 10% of the three-month data’s predictions. Finally, we find that it is easier to become a developer earlier in the project’s lifecycle than it is later as the project matures.
These results should provide insight on the social nature of gaining trust and advancing in status in distributed projects.

Measuring the Effect of Social Communications on Individual Working Rhythms: A Case Study of Open Source Software

Qi Xuan, Mohammad Gharehyazie, Premkumar T. Devanbu, Vladimir Filkov
Conference Paper Social Informatics (SocialInformatics), 2012 International Conference on, 14-16 Dec. 2012, Lausanne, Pages 78-85

Abstract

This paper proposes novel quantitative methods to measure the effects of social communications on individual working rhythms by analyzing the communication and code committing records in tens of Open Source Software (OSS) projects. Our methods are based on complex network and time-series analysis. We define the notion of a working rhythm as the average time spent on a commit task and we study the correlation between working rhythm and communication frequency. We build communication networks for code developers, and find that the developers with higher social status, represented by the nodes with a larger number of outgoing or incoming links, always have faster working rhythms and thus contribute more per unit time to the projects. We also study the dependency between work (committing) and talk (communication) activities, in particular the effect of their interleaving. We introduce multi-activity time-series and quantitative measures based on activity latencies to evaluate this dependency. Comparison of simulated time-series with the real ones suggests that when work and talk activities are in proximity they may accelerate each other in OSS systems. These findings suggest that frequent communication before and after committing activities is essential for effective software development in distributed systems.

Professional Experience

  • 2017-Now
    Executive Vice President at IT Center

    Managing daily operations

    Supervising the renovation and modernization of the university's software and hardware infrastructure

    Leading the development of software systems and services aimed at improving organizational efficiency

  • 2016-Now
    Postdoctoral Researcher at ICTIC

    Developing cloud based Data Analytics services

    Providing Mobile Analytics solutions to developers

  • Summer '14
    Research Intern

    Led a research project on the extraction of user expertise from command trace data

    Worked on several phases of the project such as data sanitization, statistical analysis, and visualization

  • 2004-14
    Web developer

    I often make websites, sometimes for my friends and family, and sometimes as a job. It helps me improve my HTML, CSS, and JavaScript skills.

    Examples include:

    • sotoodehyari.com and gharehyazie.com (this site).
    • Royan Proteomics website for “Royan Institute”
    • Iranian Proteomics Archives for “Agricultural and Biotechnological Research Institute of Iran”

  • 2008-10
    Systems Analyst
    I was a Systems Analyst at the Secretariat of the Expediency Discernment Council (SEDC). My main task was to analyze the current state of computer systems and network at SEDC and to design and propose further improvements to increase productivity. During this time I also took part in the development of the SEDC IT Master Plan.
  • 2007-08
    Software System Developer

    Designed and developed a customized bookstore retail POS & inventory software

    Due to the wide variety of KFA’s demands, a unique and comprehensive POS and inventory system was needed for its 200+ sales branches spread across the country

    The system expedited handling customers’ requests, which resulted in more than 30% increase in sales when utilized at KFA booths in Iran’s International Book Fair 2008

Teaching Assistant

  • Winter 2016

    ECS 240 - Programming Languages


  • Fall 2015

    ECS 222A - Advanced Algorithms


  • Summer 2013

    ECS 20 - Discrete Mathematics


  • Fall 2011

    ECS 154 - Computer Architecture


  • Summer 2011

    ECS 40 - Introduction to Programming


  • Spring 2011

    ECS 153 - Computer Security


  • Fall 2009

    Neural Networks


  • Fall 2009

    Game Theory


  • Winter 2008

    Artificial Intelligence


  • Fall - Winter 2005-06

    English for Students of Computer Engineering


At My Office

You can find me at my office in the IT Center at Sharif University of Technology.

I am at my office Sat. to Wed. from around 8:00 am until 7:00 pm, but consider sending an email first to set up an appointment.

At My Home

Really? Is it that urgent?

My door is always open to guests ... if they can find the door!