Internship with Benoit Baudry at KTH
Adaptation of Amplified Unit Tests for Human Comprehension

Table of Contents

1 Introduction

2 Findings

2.1 Bibliography

2.1.1 Writing

  1. Context
  2. Problem statement

    Amplified tests contain many new assertions and new inputs, and DSpot keeps a test solely because it detects new mutants: the usefulness of each individual amplification is never examined. Because the number of new mutants killed is often significantly lower than the number of amplifications (especially the number of assertions, which are what effectively detects a mutant), we end up with many useless statements. This noise is problematic because it makes the review process longer, and because the less focused a test is, the harder it is to comprehend. Noise is not the only threat to focus: the new mutants can be completely different from, and located in a different part of the SUT than, the ones the original test case could detect. The final step towards a human-friendly output would be to add context and a human-like description of the amplification. To sum up, the three identified problems to tackle are noise reduction, focus confinement, and natural language description of the amplifications.

  3. Technical Difficulties

    In order to describe a mutant, you need information about it. You could give the mutator's "category", but you only have its class name. You could give the column of the statement it was applied to, maybe to highlight it, but you do not have access to that. You could use its position in the AST, but you do not know that either.

    Knowing which assertion killed which mutant is essential, be it to start a program slice from that assertion or simply to paraphrase the assertion and explain what bug is detected. But you do not know that. And you barely know, through ugly comments, which assertions are the result of amplifications. Identifying, after the fact, the role that assertions play is cumbersome. You can run the test case against every mutant. But first, you do not have direct access to the mutants. And what would you do with them? Instrument the test, by adding a probe after each assertion (sketched below)? How do you elegantly automate such a stream of test executions? Maybe you can remove assertions one by one and see if mutants keep getting killed? I know we are in SBSE, but that is quite ugly.
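
    A minimal sketch of the probe idea, assuming the amplified test is available as a Spoon model (the Probe class and the name-based recognition of assertions are illustrative assumptions, not DSpot code):

import spoon.reflect.code.CtInvocation;
import spoon.reflect.code.CtStatement;
import spoon.reflect.declaration.CtMethod;
import spoon.reflect.visitor.filter.TypeFilter;

public class AssertionProbeInserter {

    /** Inserts a probe call after every assertion of the given test method. */
    public static void instrument(CtMethod<?> testMethod) {
        int probeId = 0;
        for (CtInvocation<?> invocation
                : testMethod.getElements(new TypeFilter<>(CtInvocation.class))) {
            // Crude, name-based recognition of assertions (assumption).
            if (invocation.getExecutable().getSimpleName().startsWith("assert")) {
                CtStatement probe = testMethod.getFactory().Code()
                        .createCodeSnippetStatement("Probe.hit(" + (probeId++) + ")");
                // When the test runs against a mutant, the first probe that is
                // never reached points at the killing assertion.
                invocation.insertAfter(probe);
            }
        }
    }
}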

    As said before, we have no direct information on the position of amplifications in the new test case, which makes it harder to generate descriptions or apply minimization to them. But what data structure would you use? Bookkeeping of positions in the AST? How would you keep it up to date across multiple rounds of amplification? One possibility, using Spoon's metadata facility, is sketched below.
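
    A minimal sketch of such tagging with Spoon's metadata mechanism; the tag lives on the AST node itself, so no positional bookkeeping has to be maintained across rounds (the key name is made up, and this assumes metadata survives the clones DSpot makes):

import spoon.reflect.code.CtStatement;
import spoon.reflect.declaration.CtElement;

public final class AmplificationTag {

    private static final String KEY = "dspot.amplification"; // hypothetical key

    /** Records, on the node itself, which amplifier produced this statement. */
    public static void mark(CtStatement statement, String amplifierName) {
        statement.putMetadata(KEY, amplifierName);
    }

    /** Returns the amplifier name, or null if the node is original code. */
    public static String amplifierOf(CtElement element) {
        return (String) element.getMetadata(KEY);
    }
}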

  4. On the usefulness of works from (code maintenance|software artifacts summarization)

    A lot of effort has been put into generating human-friendly descriptions for various kinds of software artifacts. In particular, there has been work on generating documentation (or comments, descriptions, summaries) for source code, methods, and, more interestingly, unit test cases. These tools can generate natural language descriptions of what a piece of code does and identify the most important parts or lines of code. But explaining why a code change was made or the role a piece of code plays, i.e. understanding the intentions of the developers, is harder: tools need additional information, or they limit the scope by identifying stereotypes (e.g. labelling a commit as Feature Addition).

    But those works are not directly useful for our problem. First, we know why an amplified test was kept: it is because it can detect a new bug.

  5. On the usefulness of works from test cases minimization

    Using delta-diff we can identify useless statements and then remove them. But more powerful program minimisation tools are available. One could predict that the more minimisation is applied, the less code is left to describe, and thus the easier the description is to generate; but this is not obvious, and other details have to be taken into account.

    First, we might not want to modify the original part of the test, as the developer might already be familiar with it, which makes it less overwhelming to grasp the purpose of the test case. And even if the developer has never seen the test before, a hand-written program is probably easier to understand than a compacted version.

    And existing tools probably cannot be told to leave certain parts untouched. A sketch of a minimisation pass restricted to amplified statements is given below.
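
    A minimal sketch, assuming amplified statements are tagged (as in the metadata sketch above) and that we can query which mutants a test kills; MutationRunner is a hypothetical interface, not DSpot or PIT API:

import java.util.Set;

import spoon.reflect.code.CtStatement;
import spoon.reflect.declaration.CtMethod;

public class AmplifiedTestMinimizer {

    /** Hypothetical oracle: runs a test against all mutants, returns killed mutant ids. */
    public interface MutationRunner {
        Set<String> killedMutants(CtMethod<?> test);
    }

    /**
     * Greedily drops amplified statements that do not contribute to killing
     * mutants; original, hand-written statements are never touched.
     */
    public static CtMethod<?> minimize(CtMethod<?> test, MutationRunner runner) {
        Set<String> baseline = runner.killedMutants(test);
        CtMethod<?> current = test;
        // Iterate backwards so earlier indices stay valid after a deletion.
        for (int i = current.getBody().getStatements().size() - 1; i >= 0; i--) {
            CtStatement statement = current.getBody().getStatement(i);
            if (AmplificationTag.amplifierOf(statement) == null) {
                continue; // original statement: leave it untouched
            }
            CtMethod<?> candidate = current.clone();
            candidate.getBody().getStatement(i).delete();
            if (runner.killedMutants(candidate).containsAll(baseline)) {
                current = candidate; // the statement was useless: drop it
            }
        }
        return current;
    }
}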

  6. On the usefulness of an NLG

    The generated sentences should always look the same: same structure, built from the same templates a human would use. A realisation engine such as SimpleNLG (see the references) can take care of the grammar so that the templates stay simple; a minimal example follows.
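
    A minimal SimpleNLG sketch (the subject and object strings are made up for illustration):

import simplenlg.framework.NLGFactory;
import simplenlg.lexicon.Lexicon;
import simplenlg.phrasespec.SPhraseSpec;
import simplenlg.realiser.english.Realiser;

public class DescriptionDemo {
    public static void main(String[] args) {
        Lexicon lexicon = Lexicon.getDefaultLexicon();
        NLGFactory factory = new NLGFactory(lexicon);
        Realiser realiser = new Realiser(lexicon);

        // One clause per amplification, always built from the same skeleton.
        SPhraseSpec clause = factory.createClause(
                "the amplified test", "check", "the return value of getConfig");
        System.out.println(realiser.realiseSentence(clause));
        // Prints: "The amplified test checks the return value of getConfig."
    }
}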

  7. What do we then propose as contribution

2.1.2 References

  1. Cultural
    • Search Based Software Engineering: Techniques, Taxonomy, Tutorial (harman2012search)
      • TODO
    • A Few Billion Lines of Code Later (bessey2010few)
      • Great to understand the limits of static analysis but also some of the limits of all analysis
      • Difficult to analyze code because of the diversity of build automation tools
      • "By default, companies refuse to let an external force modify anything."
      • "A misunderstood explanation means the error is ignored or, worse, transmuted into a false positive."
      • Many standards
      • Some people don't care about bugs; sometimes improving the tool reveals more bugs, which is bad for the manager
    • Lessons from Building Static Analysis Tools at Google (sadowski2018lessons) Tricorder: Building a Program Analysis Ecosystem (sadowski2015tricorder)
      • Great to understand the challenges in pushing an analysis tool in the real world
      • notes on the printed paper
      • such tools need to be
        1. Integrated/Easy to use
        2. Free of false positive
        3. Easy to understand
    • Spoon: A Library for Implementing Analyses and Transformations of Java Source Code (pawlak2016spoon)
      • let's say it's like llvm/clang
    • Regression Testing Minimisation, Selection and Prioritisation : A Survey (yoo2012regression)
      • TODO
    • Clustering Test Cases to Achieve Effective & Scalable Prioritisation Incorporating Expert Knowledge (yoo2009clustering)
      • TODO
    • Measuring software redundancy (carzaniga2015measuring)
      • TODO
    • Automatic Software Diversity in the Light of Test Suites (baudry2015automatic)
      • analysis of common features (e.g. number of tests covering one statement)
      • plastic behavior (have different behaviors while still remaining correct) study
      • different details compared to 4 and 9
    • Tailored source code transformations to synthesize computationally diverse program variants (baudry2014tailored)
      • More details than in 4
    • Selecting a Software Engineering Tool: Lessons Learnt from Mutation Analysis (delahaye2015selecting)
      • TODO
    • The Oracle Problem in Software Testing: A Survey (barr2015oracle)
      • TODO
    • Human-Centered Design Meets Cognitive Load Theory: Designing Interfaces that Help People Think (oviatt2006human)
    • Grounded Theory in Software Engineering Research: A Critical Review and Guidelines (stol2016grounded)
    • Five Misunderstandings about Case-study Research (flyvbjerg2006five)
      • TODO
    • Is Search-based Unit Test Generation Research Stuck in a Local Optimum? 🌟🌟 (rojas2017search)
      • list of challenges
        1. Searching Flat Fitness Landscapes
        2. Achieving High Code Coverage
        3. Tests Without Oracles
        4. Ugly Generated Test Code
        5. Research Papers Instead of Usable Tools
  2. Unit Testing
    • A Survey of Unit Testing Practices (runeson2006survey)
      • TODO
    • WAP: Cognitive aspects in unit testing: The hunting game and the hunter's perspective 🌟🌟 (prado2015wap)
      • TODO
      • 1st in Prado's trilogy
    • Advances in the Characterization of Cognitive Support for Unit Testing: The Bug-Hunting Game and the Visualization Arsenal 🌟🌟 (prado2016advances)
      • TODO
      • 2nd in Prado's trilogy
    • Towards cognitive support for unit testing: a qualitative study with practitioners 🌟🌟🌟 (prado2018towards)
    • A Survey on Unit Testing Practices and Problems (daka2014survey)
      • TODO
    • The Impact of Test Case Summaries on Bug Fixing Performance: An Empirical Investigation (panichella2016impact)
      • TODO
  3. Mutation Testing
    • Is Mutation Testing Ready to be Adopted Industry-Wide? (movzucha2016mutation)
    • Investigating the Correlation between Mutation Score and Coverage Score (assylbekov2013investigating)
    • An Analysis and Survey of the Development of Mutation Testing 🌟 (jia2011analysis)
      • TODO
    • PIT: A Practical Mutation Testing Tool for Java (Demo) 🌟 (coles2016pit)
      • Well written
      • PIT is fast (it manipulates bytecode), which is one of the reasons it can be used in real life
      • test selection
      • robust, easy to use, well integrated (cites 10)
    • Resolving the Equivalent Mutant Problem in the Presence of Non-determinism and Coincidental Correctness (patel2016resolving)
      • TODO
    • An Experimental Evaluation of PIT’s Mutation Operators (andersson2017experimental)
      • TODO
    • Are Mutation Scores Correlated with Real Fault Detection? (papadakis2018mutation)
      • TODO
    • A Transformational Language for Mutant Description (simao2009transformational)
      • TODO
      • unfortunately it doesn't give clues on how to describe mutants as they see mutation simply as a match-and-replace process.
        • kind of looks like a formal description of the design of a DSL
    • An Experimental Evaluation of Selective Mutation (offutt1993experimental)
    • A theoretical study of fault coupling (wah2000theoretical)
    • Proteum IM 2.0: An Integrated Mutation Testing Environment 🌟🌟 (delamaro2001proteum)
      • TODO
  4. Search-based Software Testing
    • Search-based software testing: Past, present and future (mcminn2011search)
      • Already read from previous internship
    • Genetic Improvement of Software: a Comprehensive Survey (petke2017genetic)
    • Evosuite 🌟 (fraser2011evosuite) (fraser2013evosuite)
      • State-of-the-art tool
      • Very sophisticated, difficult to reproduce experiments because it changes fast and a lot of parameters are tweaked
      • minimization
        • remove unnecessary statements
        • careful not to generate long test cases
    • An Approach to Test Data Generation for Killing Multiple Mutants 🌟 (liu2006approach)
  5. Test Amplification
    • B-Refactoring: Automatic Test Code Refactoring to Improve Dynamic Analysis (xuan2016b)
      • Split tests for each fragment to cover a simple part of the control flow.
      • Help with respect to fault localization.
    • Test data regeneration: generating new test data from existing test data (yoo2012test)
    • The Emerging Field of Test Amplification: A Survey (danglot2017emerging)
      • Dense
      • Good overview of goals (Table 1) and methods (Table 2)
    • DSpot: Test Amplification for Automatic Assessment of Computational Diversity (baudry2015dspot)
      • Comparison with TDR 2 and also concurrent to 7
        • "the key differences between DSpot and TDR are: TDR stacks multiple transformations together; DSpot has more new transformation operators on test cases: DSpot considers a richer observation space based on arbitrary data types and sequences of method calls."
        • "We count the number of variants that are identified as computationally different using DSpot and TDR. "
    • A Systematic Literature Review on Test Amplification 🌟
      • TODO
    • Genetic-Improvement based Unit Test Amplification for Java 🌟
      • TODO
    • Dynamic Analysis can be Improved with Automatic Test Suite Refactoring (xuan2015dynamic)
      • TODO
    • Automatic Test Case Optimization: A Bacteriologic Algorithm (baudry2005automatic)
      • TODO
      • Compared to DSpot, no assertions generation, small programs.
  6. Automated Test Generation
    • How Do Automatically Generated Unit Tests Influence Software Maintenance? 🌟🌟 (shamshiri2018how)
      • TODO
    • Generating Unit Tests with Descriptive Names Or: Would You Name Your Children Thing1 and Thing2? 🌟🌟🌟 (daka2017generating)
      • TODO
    • An Empirical Investigation on the Readability of Manual and Generated Test Cases 🌟🌟🌟 (grano2018empirical)
      • TODO
  7. Generating natural language descriptions for software artifacts
    1. Surveys
      • Survey of Methods to Generate Natural Language from Source Code 🌟 (neubig2016survey)
        1. Survey papers
          • recommends 2
        2. Generation Methods
          1. manual rules/templates
            • SWUM 2&6
              • test cases 4 & 3
              • changes 6 & 1
              • exceptions 5
            • multiple lines description 5
              • not useful, too high level
            • using execution path information 5 & 4
              • not useful(?)
        3. Content Selection Methods
        4. Targeted Software Units
        5. Training Data Creation
        6. Evaluation
          • TODO later
      • Summarizing Software Artifacts: A Literature Review 🌟 (nazar2016summarizing)
        • very complete
      • Automatic Summarising: The State of the Art (jones2007automatic)
    2. Tools for tests
      • Automatically Documenting Software Artifacts 🌟 (li2018automatically)
        • PhD thesis
        • Chapter 4 (p. 109) on tag for unit tests
        • catalog of 21 stereotypes for methods in unit tests
          • 14 JUnit API-Based Stereotypes for Methods in Unit Test Cases
            • Boolean verifier
            • Null verifier
            • Equality verifier
            • Identity verifier
            • Utility verifier
            • Exception verifier
            • Condition Matcher
            • Assumption setter
            • Test initializer
            • Test cleaner
            • Logger
            • Ignored method
            • Hybrid verifier
            • Unclassified
          • 7 C/D-Flow Based Stereotypes for Methods in Unit Test Cases
            • Branch verifier
            • Iterative verifier
            • Public field verifier
            • API utility verifier
            • Internal call verifier
            • Execution tester
            • Empty tester
      • Automatically Documenting Unit Test Cases 🌟🌟 (li2016automatically) (git)
        • Survey with developers and projects mining study to justify automatic documentation of unit tests
        • uses a SWUM implementation in C#
        • example of templates and placeholders
        • as with other similar works it may not be useful for us
      • Towards Generating Human-Oriented Summaries of Unit Test Cases 🌟 (kamimura2013towards)
      • Automated Documentation Inference to Explain Failed Tests (zhang2011automated)
        • could be used to improve the documentation and precision of try/catch amplification
      • Automatically Identifying Focal Methods under Test in Unit Test Cases (ghafari2015automatically)
        • not useful, we are focusing on explaining edge cases
    3. Commits/Code changes
      • On Automatically Generating Commit Messages via Summarization of Source Code Changes (cortes2014automatically) ChangeScribe: A Tool for Automatically Generating Commit Messages (linares2015changescribe)
        • Good entry point for the related work
        • Classifies commit with stereotypes
        • Uses templates for sentences, and fills it with commit stereotypes (2)
        • lacks 'why' information
      • Using Stereotypes to Help Characterize Commits (dragan2011using)
        • Only categorize based on added or deleted methods
      • Towards Automatic Generation of Short Summaries of Commits (jiang2017towards)
      • Automatically Generating Commit Messages from Diffs using Neural Machine Translation (jiang2017automatically)
        • trying to be less verbose and add context
      • On Automatic Summarization of What and Why Information in Source Code Changes (shen2016automatic)
        • Better than ChangeScribe1
        • Categories of Commits in Terms of Maintenance Task and Corresponding Description (based on 3) (why information)

           Categories of commits  Description
           ---------------------  --------------------------------
           Implementation         New requirements
           Corrective             Processing failure
                                  Performance failure
                                  Implementation failure
           Adaptive               Change in data environment
           Perfective             Processing inefficiency
                                  Performance enhancement
                                  Maintainability
           Non functional         Code clean-up
                                  Legal
                                  Source control system management
        • What information: description (more like diff (ChangeDistiller) dump) of changes
        • only keep information for methods that are called many times
        • boilerplates not interesting
      • Automatically Documenting Program Changes (buse2010automatically)
        • precise description
        • nicely written, but not useful for us
      • Towards a taxonomy of software change 🌟 (buckley2005towards)
        • purely about what information
        • nice charts and tables to display all possible information
    4. General/Others
      • Comment Generation for Source Code: State of the Art, Challenges and Opportunities (https://arxiv.org/pdf/1802.02971.pdf)
        • TODO
        • Information Retrieval ("analyze the natural language clues in the source code") -> not relevant
        • Program Structure Information (summary from important statements) -> not relevant(?)
        • Software Artifacts Beyond Source Code (using the social interaction revolving around development) -> not relevant
        • Fundamental NLP Techniques -> not relevant
        • Not very useful… "current approach only generate descriptive comments"
      • The Emergent Laws of Method and Class Stereotypes in Object Oriented Software (dragan2011emergent)
        • Excerpt from PhD Thesis
        • Source of the Taxonomy of Method Stereotypes 🌟
        • C++
      • The Dimensions of Maintenance (swanson1976dimensions)
        • Foundational paper
      • JStereoCode: Automatically Identifying Method and Class Stereotypes in Java Code (moreno2012jstereocode)
        • Extending Dragan's work for Java
      • Automatic Documentation Inference for Exceptions 🌟 (buse2008automatic)
        • well written
        • could be used to improve the documentation and precision of try/catch amplification
        • nice study of percentage of what and why information in open-source projects' commit messages
      • Towards Automatically Generating Summary Comments for Java Methods 🌟 (sridhara2010towards) (+ PhD thesis)
        • well written
        • SWUM, central lines selection, …
        • again not exactly useful for us
      • Integrating Natural Language and Program Structure Information to Improve Software Search and Exploration (hill2010integrating)
        • PhD thesis
        • Source of SWUM
        • SWUM implementation as Eclipse plugin
      • Swummary: Self-Documenting Code (herbert2016swummary) (git)
        • focal method extraction -> Swum.NET
      • Automatic Source Code Summarization of Context for Java Methods (mcburney2016automatic)
        • looks very complete but again not quite useful
      • Method Execution Reports: Generating Text and Visualization to Describe Program Behavior 🌟🌟 (beck2017method)
        • good list of possible information
        • TODO
      • Towards Automatically Generating Descriptive Names for Unit Tests (zhang2016towards)
        • TODO
  8. Commits/Code survey
    • What’s a Typical Commit? A Characterization of Open Source Software Repositories (alali2008s)
      • Useful to know what terms to use
      • According to 1 the most used terms are fix, add, test, bug, patch and the most used combinations are file-fix, fix-use, add-bug, remove-test, and file-update.
    • On the Nature of Commits (hattori2008nature)
    • What do large commits tell us? A taxonomical study of large commits (hindle2008large)
      • extending 3
    • Cognitive Processes in Program Comprehension (letovsky1987cognitive)
      • Foundational paper
    • On the Naturalness of Software (hindle2012naturalness)
      • Code is repetitive and predictable
  9. Natural Language Generator
    • SimpleNLG: A realisation engine for practical applications (gatt2009simplenlg)
      • TODO
  10. Code Evolution
  11. Test Case Minimisation
  12. Not Relevant
    1. Knowledge
      • Poster: Construct Bug Knowledge Graph for Bug Resolution (wang2017construct)
      • Towards the Visualization of Usage and Decision Knowledge in Continuous Software Engineering (johanssen2017towards)
        • Pretty figures
        • Design of a tool to visualize various kinds of knowledge
      • Method Execution Reports: Generating Text and Visualization to Describe Program Behavior (beck2017method)
    2. Testing Related
      • SCOTCH: Test-to-Code Traceability using Slicing and Conceptual Coupling (qusef2011scotch)
      • ComTest: A Tool to Impart TDD and Unit Testing to Introductory Level Programming (lappalainen2010comtest)
    3. Others
      • A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes (loyola2017neural)
        • Multiple good citation to papers on NL and SE
      • Automatically Capturing Source Code Context of NL-Queries for Software Maintenance and Reuse (hill2009automatically)
      • How to effectively use topic models for software engineering tasks? an approach based on genetic algorithms (panichella2013effectively)
        • Enhancement that doesn't really interest us
        • "in the context of three different SE tasks: (1) traceability link recovery, (2) feature location, and (3) software artifact labeling."
      • Software traceability with topic modeling (asuncion2010software)
        • "navigate the software architecture and view semantic topics associated with relevant artifacts and architectural components"
      • Automatically Detecting and Describing High Level Actions within Methods (sridhara2011automatically)
        • too high level
      • Automatic Generation of Natural Language Summaries for Java Classes (moreno2013automatic)
      • Using Method Stereotype Distribution as a Signature Descriptor for Software Systems (dragan2009using)
      • Reverse Engineering Method Stereotypes (dragan2006reverse)
      • Supporting Program Comprehension with Source Code Summarization (haiduc2010supporting)
        • motivations
      • Natural Language-based Software Analyses and Tools for Software Maintenance (pollock2009natural)
        • more about analysis than generation

2.2 Contribution

2.2.1 Minimisation

2.2.2 Focus

2.2.3 Replace original test or keep both

2.2.4 Explanation

  1. Slicing
  2. Natural Description

2.2.5 Ranking

3 Development

4 Global Goals [2/2]

4.1 DONE Report <2018-06-08 Fri 12:00>

  • Thank the whole team in the report (Benjamin, Benoit, Martin)

4.2 DONE Defense <2018-06-27 Wed 14:00>

4.2.1 DONE Talk @ Workshop Software Engineering Research <2018-03-08 Thu 10:00>–<2018-03-08 Thu 10:20>

4.2.2 DONE Talk @ Workshop Software Engineering Research <2018-05-08 Tue>

4.2.3 Defense Rehearsal @ ENS <2018-06-22 Fri>

  • Actually, no

5 Journal [21/21]

5.1 DONE Preliminary Bibliographical Work <2017-09-18 Mon>–<2018-02-07 Wed>

5.1.1 Things Done

  • Meeting with Benoit <2017-09-22 Fri>
    • 1, 2, 3 issues for possible work to do
    • 1 possible work: explain if a mutant isn't killed because of oracle or input
    • focus on mutation (e.g. mutation score)
    • work will be on Dspot and PIT.
  • Read blog on PIT and Descartes
    • Sum up PIT/Descartes
    • List of wanted features
  • Meeting with Benoit <2017-11-23 Thu>
    • The purpose of DSpot has shifted right?
      • interesting to talk about the history in bibliography? No, there is a new paper
    • Enough space to talk about related work? present a few papers in details and cite others
    • Current organisation of bibliography
      • General techniques
        • Definitions
        • Mutants
        • etc
      • Useful tools
        • DSpot
    • do extensive evaluation (comparison from scratch vs amplification)
    • find literals to help tests
    • add mutation operator for specific data structures
    • stack mutations
    • add explanations
    • 3 big open problems
  • Meeting with Benoit <2017-12-22 Fri>
    • reduce only the generated tests
    • big question: minimal generated tests
      • pre- or post-treatment
      • order of presenting PRs
      • this is the big question
      • we don't want to touch the original suite
      • we want the programmer to understand the new tests
    • add an example of junit test
    • talk about the trend of genetic improvement
    • don't necessarily cite Automatic software diversity in the light of test suites and Tailored source code transformations to synthesize computationally diverse program variants
  • Talk rehearsal <2018-01-28 Sun 08:30>, notes by Vladislav
    • More illustrations (workflow graph?)
    • Check the test case example (too complicated for not much, not really java)
    • Year and conference acronym in footcite
    • Careful with lambdas for TDR (check with supervisor)
    • More details on commits/pull requests and emphasize the importance of developers reviewing generated tests
    • Slide 10 -> ugly (different spacings)
    • Stacking operators: explanation too sparse
    • 4th point in conclusion slide too vague. Not just the goal but also the mean to achieve it
  • https://blog.acolyer.org/2018/01/23/why-is-random-testing-effective-for-partition-tolerance-bugs/

5.1.2 Blocking Points

5.1.3 Planned Work [6/6]

  • Read papers
  • Meeting with Benoit <2017-09-22 Fri 15:00>–<2017-09-22 Fri 15:30>
  • Meeting with Benoit <2017-11-23 Thu 15:00>–<2017-11-23 Thu 16:00>
  • Send link to repo
  • Ask Maud about plane tickets refund
  • Meeting with Benoit <2017-12-22 Fri 10:30>–<2017-12-22 Fri 11:30>

5.2 DONE Week 1 & 2 <2018-02-07 Wed>–<2018-02-18 Sun>

5.2.1 Things Done

  • Wrote the little example of use of Spoon (I simply added it in spoon-examples)
package fr.inria.gforge.spoon.transformation;

import spoon.processing.AbstractProcessor;
import spoon.reflect.code.*;

/**
 * Replaces an if statement with its then-branch when there is no else branch
 * and the then-branch consists only of a return.
 *
 * @author Simon Bihel
 */
public class RemoveIfReturn extends AbstractProcessor<CtIf> {

    @Override
    public void process(CtIf element) {
        CtStatement elseStmt = element.getElseStatement();
        if (elseStmt != null) { return; } // only handle ifs without an else branch

        CtStatement thenStmt = element.getThenStatement();
        if (thenStmt instanceof CtReturn) { // then-branch is directly a return
            element.replace(thenStmt);
            return;
        }
        if (thenStmt instanceof CtBlock) { // then-branch is a block starting with a return
            // (a statement after a return would be unreachable and rejected by
            // the compiler, so a leading return is necessarily the only statement)
            CtBlock<?> block = (CtBlock<?>) thenStmt;
            if (!block.getStatements().isEmpty()
                    && block.getStatement(0) instanceof CtReturn) {
                element.replace(thenStmt);
            }
        }
    }
}
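
  • Such a processor can be driven by Spoon's Launcher, for reference (the input path below is a placeholder):

import spoon.Launcher;

public class Main {
    public static void main(String[] args) {
        Launcher launcher = new Launcher();
        launcher.addInputResource("src/main/java"); // placeholder path
        launcher.addProcessor(new RemoveIfReturn());
        launcher.setSourceOutputDirectory("spooned");
        launcher.run(); // parses, processes, and pretty-prints the transformed sources
    }
}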

  • Clang static analyzer for windows
    • Clang is painful to install on Windows… It requires llvm and Microsoft Visual Studio. And there is no other choice than building from source. And it requires Perl to run.
    • Should probably use Cppcheck
    • Cppcheck has a GUI and an installer for Windows. 👍
    • example of bugs http://courses.cs.vt.edu/~cs1206/Fall00/bugs_CAS.html
    • no bug in the provided code
  • Software Maintenance seems to be an important keyword/field for the documentation of code
  • To what extent are documenting source code changes useful for us?
    • Only few changes made by DSpot
    • The source of the change is a tool, not a human
    • Still useful to see how they formulate features in natural language
    • DSpot doesn't add new features, we want the purpose of enhanced tests.
    • Don't really care about Pyramid method because it compares with human written messages
  • GitHub's PR templates are just plain text templates.
  • Went through papers that cited ChangeScribe. Went partly through citations by ChangeScribe.
  • Spent a lot of time on generating natural language from source code
  • Submitted a fix for a bug in vim-orgmode
  • Natural Language Generators
    • found on github, for java
      1. SimpleNLG
        • 410 stars, 215 citations
        • Seems to be just what we need
      2. NaLanGen
        • 2 stars
    • ChangeScribe seems to use a homemade generator
  • "The Software Word Usage Model (SWUM) is one of the first models of this type, and can be used for converting Java method calls into natural language statements (Hill et al., 2009)."
  • Looking at the code of DSpot to get info on generated tests
    • looks like a list of amplified tests is generated and you don't know which amplifier was applied

5.2.2 Blocking Points

  • Is it useful to explore approaches for augmenting the context provided by differencing tools?

5.2.3 Planned Work [6/12]

  • Read papers
  • should I register for ICST? and ICSE? -> Yes, talk/remind Benoit
  • Sign papers grant
  • Is there a Slack or something?
  • Get familiar with Spoon
    • Read paper
    • Little project, remove if when there is no else and the body is just a return.
      • Write the program
      • Write tests
  • Get familiar with Dspot
    • Running it
    • Contributing
      • Pick issues
      • Fix them
  • See boiler-plates for NLP way of building sentences.
    • a.k.a templates, placeholder templates
    • Search for papers and read them
    • Search for tools
  • Sign contract with KTH Relocation <2018-02-13 Tue 14:00>–<2018-02-13 Tue 15:30>
  • Categorize papers of preliminaries
  • Lookup what static analysis is possible with clang Cppcheck [100%]
    • find tools
    • it is for mechatronics students who write small programs for arduinos
    • show them what tests are and what's possible to discover bugs
    • Think of what they could be taught
    • Test Cppcheck on a windows machine
      • Install windows on the small computer
      • Test the code provided in the course
  • Go to Entré for badge and PIN code outside working hours
  • Run tools that I encounter in papers

5.3 DONE Week 3 <2018-02-19 Mon>–<2018-02-25 Sun>

5.3.1 Things Done

  • Work on DSpot documentation
  • Read reviews of bibliographic report
  • How to remember what amplification has been applied?
    • Go through logs
      • nothing useful in them
    • Comments directly in the code
      • name of the amplifier used in the line before
      • could easily be enriched if necessary
    • Enrich test methods with a new parameter
      • last resort
  • A json file summarizes killed mutants (with their location)
  • Need to keep focus

To select the new test case to be proposed as pull request, we look for an amplified test that kills mutants which are all located in the same method.

(this was done manually)

  • Need for automated minimization

A second point in the preparation of the pull request relates to the length of the amplified test. Once a test method has been selected as a candidate pull request, we analyze it and manually make it clearer and more concise, we call this process the manual minimization of the amplified test. We note that automated minimization of amplified tests is an interesting area of future work, left out of the scope of this paper.

  • SWUM is really about analysis. Trying to reformulate things without making sense of them.
  • Possible title: Adaptation of Amplified Unit Tests for Human Comprehension
  • Swum.NET

UnitTestScribe also uses SWUM.NET to generate a general NL description for each unit test case method. SWUM.NET captures both linguistic and structural information about a program, and then generates a sentence describing the purpose of a source code method.

  • Started writing
  • Made a PR for vim-grammarous
  • Discussion on how to minimize generated tests

5.3.2 Blocking Points

  • Where is the "keep tests that kill mutants all located in the same method" step? It seems to be implemented judging from the paper, but the issue is still open and it proposes a solution that seems different from just looking at the json file at the end of the process.
    • it was done manually

5.3.3 Planned Work [7/12]

  • Read papers
  • Register for ICST
  • Get familiar with Dspot [1/6]
    • Running it
    • Contributing
      • Pick issues
      • Fix them
    • Write documentation [2/4]
      • Key methods [3/5]
        • Assertion generation [2/2]
          • AssertGenerator
          • MethodsAssertGenerator
        • Input amplification [1/2]
          • glue
          • amplifiers
        • Pre-amplification
        • Amplification
        • Compilation & run [2/3]
          • compileAndRunTests
          • buildClassForSelection
          • TestCompiler
      • Rename amplifyTests to express the fact that it is only doing input amplification
      • compileAndRunTests
        • Why return null when not all methods were compilable or some tests failed?
      • Renaming plural variables
    • Work on removing all deprecated classes in stamp [0/1]
      • Remove unused deprecated methods of TestSelector
    • More precise try/catch?
      • Would that be useful? Feasible?
    • Extract hard-coded amplifications messages
  • Lab access denied outside working hours
    • Go to Entré
    • Go again to Entré
    • Send email to request access to the lab
      • resend
    • Resolved
  • Run tools that I encounter in papers
    • tools not really useful are they(?)
    • closing this for now
  • Find a way to know which amplifications have been applied and/or how to implement it
  • Make DHELL PR maven compiler version
  • Start writing [0/4]
    • Problem statement
      • scientific
        • quite short
      • technical
    • Comparison with works on description
      • Explaining what they do
        • badly written
        • quite short
      • Why we can't apply them for our work
    • Comparison with works on test cases minimization
      • Explaining what they do
      • Why we can't apply them for our work
    • Whether using an NLG is useful
  • Start doing a simple NL commit messages generator
    • for later, first we need minimization
  • Maybe reorganize the references on descriptions
  • Read about identify essential parts of a test for killing a specific mutant
  • Search for papers on mutation testing and same location targeting

5.4 DONE Week 4 <2018-02-26 Mon>–<2018-03-04 Sun>

5.4.1 Things Done

  • Added git hook to commit the html version of the reporting
  • Explored the use of slicing to detect the cause of new killed mutant
    • Need observation-based slicing with mutation score(?)
  • Nothing on summarization and mutation testing
    • You usually think the other way around: what do I need to do in order to kill this new mutant?
  • srcSlice not supporting Java (paper)
  • JavaSlice does not support Java 8
  • Kaveri (Indus Java Program Slicer) old and eclipse plugin
  • JavaBST not available? Paper badly written
  • WALA
  • Fixed org export and also pull on server
  • Starred every vim plugin I use with Github's API and PyGitbub
  • Explored end-user description of Pitest mutators
    • Pitest has user-friendly mutators, now the question is how to use/access them
cd .. && mvn clean package -DskipTests && cd dhell && mvn clean package \
  && java -jar ../dspot/target/dspot-1.0.6-SNAPSHOT-jar-with-dependencies.jar \
       -p ./dspot.properties -i 1 -t eu.stamp.examples.dhell.HelloAppTest \
       -a MethodAdd --verbose \
  && vim dspot-out/eu.stamp.examples.dhell.HelloAppTest_mutants_killed.json

5.4.2 Blocking Points

  • NL commit message generator
    • how to know which amplifications were applied?
  • What is a program/test slice for a mutation score criterion?
    • dataflow slice starting from the killing assertion

5.4.3 Planned Work [3/9]

  • Register for ICST
  • Dspot [2/5]
    • Contributing
    • Write documentation [2/2]
      • Key methods [2/2]
        • Input amplification
          • amplifiers
        • Compilation & run
          • TestCompiler
            • no need
      • compileAndRunTests
        • Why return null when not all methods were compilable or some tests failed?
      • PR
    • Work on removing all deprecated classes in stamp
      • Remove unused deprecated methods of TestSelector
    • More precise try/catch?
      • Would that be useful? Feasible?
    • Extract hard-coded amplifications messages
  • Start writing [0/4]
    • Problem statement
      • technical
    • Comparison with works on description
      • Why we can't apply them for our work
    • Comparison with works on test cases minimization
      • Explaining what they do
      • Why we can't apply them for our work
    • Whether using an NLG is useful
  • Read about identify essential parts of a test for killing a specific mutant
  • Search for papers on mutation testing and same location targeting
  • Start doing a simple NL commit messages generator [0/2]
    • DSpot automated PR
    • Simple PR description [3/4]
      • Add a field in the killed mutants json file
      • Print it
        • done automatically
      • Stupid message
      • Long stupid description
        • Get what amplifications were applied
        • done
  • Replace fr.inria.stamp with eu.stamp
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant [1/4]
    • Need a more precise location for the mutant location
      • column number
        • not available
      • maybe I don't need it
    • Need to know the killing assertion
      • Add a trace of this when a test is kept
    • Adding as dependency
    • Use it

5.5 DONE Week 5 <2018-03-05 Mon>–<2018-03-11 Sun>

5.5.1 Things Done

  • Tried to use Sourcetrail
    • Needed to run mvn install -DskipTests -Djacoco.skip=true
    • displayed no references or class
  • Worked on presentation for the workshop
  • Proposed mutators taxonomy
    • Literal change
    • Object change
    • New assertion
  • Meeting with Benoit
    • in commit message, talk about bugs instead of mutants
    • 3 steps
      • oracle enhancement only
      • new input
      • combination
    • write why the problem is difficult
    • write different kinds of message with each a specific focus
    • maybe compare trace of amplified test vs original
    • study commit messages related to tests
  • git log --grep "^Tested"

5.5.2 Blocking Points

  • What will the scientific contribution be?
    • Software Engineering is often at the border.
    • We tackle complex problem, that the industry is not particularly interested in, at least directly.
    • applying existing methods and seeing whether they scale, or simply that they can be implemented, is a contribution in itself
  • What kind of evaluation?
    • survey
    • performance
    • comparison with study of repos

5.5.3 Planned Work [3/10]

  • Talk @ Workshop Software Engineering Research <2018-03-08 Thu 10:00>–<2018-03-08 Thu 10:20>
    • Workshop <2018-03-08 Thu 09:30>–<2018-03-08 Thu 12:30>
    • Room 4523
  • Register for ICST
  • Dspot [0/1]
    • Extract hard-coded amplifications messages
  • Start writing [4/4]
    • Problem statement
      • technical
    • Comparison with works on description
      • Why we can't apply them for our work
    • Comparison with works on test cases minimization
      • Explaining what they do
        • rephrase a description from a survey or something
      • Why we can't apply them for our work
    • Whether using an NLG is useful
  • Start doing a simple NL commit messages generator
    • Long stupid description
      • Get what amplifications were applied
      • done
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant [1/4]
    • Need a more precise location for the mutant location
      • column number
        • not available
      • maybe I don't need it
    • Need to know the killing assertion
      • Add a trace of this when a test is kept
    • Adding as dependency
    • Use it
  • Fix https://github.com/STAMP-project/dspot/issues/336
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing

5.6 DONE Week 6 <2018-03-12 Mon>–<2018-03-18 Sun>

5.6.1 Things Done

5.6.2 Blocking Points

  • How to detect an amplification that modifies a statement?
    • added amplification -> easy
    • modifying amplification -> ?
      • maybe they have tags/annotations?
        • maybe I could implement that
    • use annotations during amplification process to "tag" amplified statements
  • What about a change listener to detect amplifications? and easier amplification counter
    • it is silly because we are applying amplifications
    • and big overhead
    • No, use annotations

5.6.3 Planned Work [8/17]

  • Change apartment <2018-03-15 Thu>
    • move out, hotel -> university
    • retrieve keys <2018-03-15 Thu 12:00>–<2018-03-15 Thu 16:30>
    • move in
  • Register for ICST
  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [1/2]
    • https://github.com/STAMP-project/dspot/issues/362#issuecomment-372384168
      • it is possible to explore the AST and get the amplifications
    • documentation [4/4]
      • reduce
      • ampTestToParent
      • tmpAmpTestToParent [5/5]
        • is it a buffer to add new relations after applying mutators?
          • yes, to have rounds
        • why isn't it used everywhere?
          • ampTestToParent is directly modified
        • returns the input without modification, need refactor
        • when it is used somewhere, amplification counter not incremented
          • I was wrong
        • document it
      • StatementAdd doesn't increment amplification counter
      • these issues were resolved
    • implement annotations
    • explore trees to find amplifications
  • Clean amplification code [8/8]
    • amplification counter increment during cloning
      • two methods
        • refactor i-amp
        • refactor a-amp
      • remove original public method and make sure everything work
      • clean imports
    • rework assertion counter because 1 clone can mean many a-amplifications
      • don't increment counter in during cloning
    • removing an assertion means +1 for the counter?
      • no
    • parenting link
      • update parenting link during cloning
      • remove updating outside
      • remove plain getter
      • load buffer before starting
    • parenting map loading is ugly
      • yeah, well…
    • documentation
    • tests for verifying counter?
      • with report
    • close issue in message
  • rename Amplifier to InputAmplifier
    • too many conflicts
  • add up amplifications of parents?
    • no if a parent has amplifications it is reported
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Ask for access outside working hours
  • Respond to https://github.com/STAMP-project/dspot/issues/54
  • Understand https://github.com/STAMP-project/dspot/pull/360
  • Work on report
    • use explicit definitions
    • work on background sections
    • add Java examples
    • insist on the distinction between what and why information
    • describe thoroughly the oracle problem
  • Read papers
  • Meeting with Benoit <2018-03-16 Fri>
    • Make a formal proposal of natural language description
    • Ask people what they think about it
    • Ask (Simon U, Spoon) (, XWiki) (, SAT4J) what they think of my proposal
    • Difficulties to evaluate because there isn't a lot of material (DSpot isn't an established tool).
  • formal description of the NL description
  • ask for opinions

5.7 DONE Week 7 <2018-03-19 Mon>–<2018-03-25 Sun>

5.7.1 Things Done

  • https://stackoverflow.com/questions/14040/developer-testing-vs-qa-team-testing-what-is-the-right-division-of-work
  • Precise description of the NL amplification description (a minimal template-filling sketch follows this list)
    1. High level description:
      • Enhancement or rework
        1. [TEST] Enhancement of <original test's name>.

        2. [TEST] New test.

      • Part of the system where mutants are

        Target <method with mutants>.

    2. Slice for each new mutant killed, starting from the killing assertion
      1. New oracle
        • NL paraphrase of the assertion

          The <checked property> is checked.

        • NL description of the impact of the mutant
          • variables with different values

            If <changed variables> have different values,

          • branches differences

            and <different branches> are explored, the test can detect them.

      2. New behaviour
        1. Enhancement (Old mutants are still detected)
          • new interactions with objects

            New interactions with <object> using <methods>.

          • if also new oracle enhancement
            • new branches during execution

        2. Rework
          • unit test documentation

            <methods> are called on <object>.
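
    A minimal sketch of filling the high-level templates above (the values are hypothetical; in practice they would come from DSpot's reports):

public class PullRequestMessage {

    /** Fills the high-level description templates proposed above. */
    static String highLevelDescription(String originalTestName,
                                       String targetedMethod,
                                       boolean isEnhancement) {
        String header = isEnhancement
                ? String.format("[TEST] Enhancement of %s.", originalTestName)
                : "[TEST] New test.";
        return header + "\n" + String.format("Target %s.", targetedMethod);
    }

    public static void main(String[] args) {
        System.out.println(highLevelDescription(
                "testEquals", "com.squareup.javapoet.TypeName#equals", true));
    }
}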

  • Learned about the visitor pattern. That name is regrettably confusing.
  • Mutant description
    • Whole method removal
      • "this method was previously not tested in a scenario where it is useful"
    • change of condition
      • "these branches were previously not tested in a scenario where they are useful"
    • change to value of variable
      • "this variable was previously not tested in a scenario where it is useful"
    • DON'T DESCRIBE MUTANTS
      • too complicated, huge repercussions, little insights
      • starting to question the relevance of focus on mutants in the same method
        • should see it the other way, check properties for this method
  • Got a cold 🤧

5.7.2 Blocking Points

  • Assertion count
    • Is the process: remove all assertions and generate all possible assertions?
    • if so then all assertions are counted as amplifications
  • Assertion log, can there be REMOVE+ADD instead of MODIFY?

5.7.3 Planned Work [0/12]

  • Register for ICST
  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [6/9]
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report
    • use explicit definitions
    • work on background sections
    • add Java examples
    • insist on the distinction between what and why information
    • describe thoroughly the oracle problem
  • Read papers
  • formal description of the NL description
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)

5.8 DONE Week 8 <2018-03-26 Mon>–<2018-04-01 Sun>

5.8.1 Things Done

  • I did so much crap in the PR to refactor parent map 😣
  • Discovered the field of cognitive support for unit test comprehension
    • why didn't I hear about that before???
  • ID of mutator now available in report
  • Worked on the report

5.8.2 Blocking Points

  • Assertion count
    • Is the process: remove all assertions and generate all possible assertions?
    • if so then all assertions are counted as amplifications
  • Assertion log, can there be REMOVE+ADD instead of MODIFY?
  • Should I focus solely on mutant descriptions, amplification descriptions, or the test case as a whole?
  • Need to identify the main object that is interacted with

5.8.3 Planned Work [2/12]

  • Register for ICST
  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [1/4]
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report
    • use explicit definitions
    • work on background sections
    • add Java examples
    • insist on the distinction between what and why information
    • describe thoroughly the oracle problem
  • Read papers
  • formal description of the NL description
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)

5.9 DONE Week 9 <2018-04-02 Mon>–<2018-04-08 Sun>

5.9.1 Things Done

  • Spent time on UCL PhD application and IELTS

5.9.2 Blocking Points

  • Assertion count
    • Is the process: remove all assertions and generate all possible assertions?
    • if so then all assertions are counted as amplifications
  • Assertion log, can there be REMOVE+ADD instead of MODIFY?
  • Should I focus solely on mutant descriptions, amplification descriptions, or the test case as a whole?
  • Need to identify the main object that is interacted with

5.9.3 Planned Work [5/17]

  • Plan ICST trip
    • hotel
      • going to take the train every day (the train alone is 1h, one way)
    • train
  • Get reimbursed for ICST
  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [1/4]
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report [1/6]
    • use explicit definitions
    • work on background sections
    • add Java examples
    • insist on the distinction between what and why information
    • describe thoroughly the oracle problem
    • add papers from 'General/Others'
  • Read papers
  • formal description of the NL description
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
  • SCOTCH might actually be useful
    • check other 'Not Relevant' papers
  • Meeting with Benoit <2018-04-03 Tue>
    • Preparation for ICST
      • hotel or train every day?
        • don't know
    • my future
      • Interesting PhD at UCL but already 3 internships that went badly
        • have to be motivated, ask questions
    • citing blog posts
      • yes
    • no subjective comment
    • a test is a subset of observable responses that are equal to specification
    • a program is complete but not specifications
    • good introduction to talk about abstract definition of verification
    • need to select subset of R (in def of test activities)
    • encapsulation principle hides stuff which means you can't observe them
      • if something is private you can't test it
    • delegation is difficult to test
    • extremely difficult to define levels of tests
    • how do you know you have a good oracle
    • fig 3, test inputs are only lines 2-5 & 7
    • be clear about what information we have about the amplified test, how it is collected, etc.
    • speak about coverage more than mutants
    • decide between coverage enhancement, slicing or minimization, text or causal relationships
    • need a good use case throughout the thesis
  • Questions about the PhD position at UCL
  • Why is KTH blocking my ens-rennes mails
    • grey listed
    • send email to support
    • warn benoit and madeleine

5.10 DONE Week 10 <2018-04-09 Mon>–<2018-04-15 Sun>

5.10.1 Things Done

  • ICST
    • Infer, Sapienz
    • 1st talk interesting, good intro on flaky tests
    • Repairnator paper
    • Tests as specifications, or something like that, a paper that gives formal definitions of tests
    • Talk about over fitting in thesis
    • Write use cases instead of evaluation with developers
    • Really good keynote for testing in the video game industry
    • Nice paper/presentation by Shin Hong
    • Good talk by José Rojas on my subject
      • Check references
    • Check assumptions generation for mutation testing (mutant assumption) (question for talk about mutation compression)
  • Code Defenders talk by José Rojas
  • Bachelor and Master workshop
    • Zimin's talk
      • what if you predict a line in another function that is closer to another prediction in the good function

5.10.2 Blocking Points

  • Assertion count
    • Is the process: remove all assertions and generate all possible assertions?
    • if so then all assertions are counted as amplifications
  • Assertion log, can there be REMOVE+ADD instead of MODIFY?
  • Should I focus solely on mutant descriptions, amplification descriptions, or the test case as a whole?
  • Need to identify the main object that is interacted with

5.10.3 Planned Work [3/22]

  • ICST <2018-04-10 Tue>–<2018-04-12 Thu>
  • Apply to the PhD position
  • Check advices from last week meeting with Benoit
  • Get reimbursed for ICST
  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [1/4]
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report [1/6]
    • use explicit definitions
    • work on background sections
    • add Java examples
    • insist on the distinction between what and why information
    • describe thoroughly the oracle problem
    • over fitting
  • Read papers
  • formal description of the NL description
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
  • SCOTCH might actually be useful
    • check other 'Not Relevant' papers
  • read code defenders papers
  • check Repairnator paper
  • Tests as specifications, or something like that, a paper that gives formal definitions of tests
  • Use cases instead of evaluation
  • Check mutant assumption
  • Read José Rojas's ICST paper

5.11 DONE Week 11 <2018-04-16 Mon>–<2018-04-22 Sun>

5.11.1 Things Done

5.11.2 Blocking Points

  • Assertion count
    • Is the process: remove all assertions and generate all possible assertions?
    • if so then all assertions are counted as amplifications
  • Assertion log, can there be REMOVE+ADD instead of MODIFY?
  • Should I focus solely on mutant descriptions, amplification descriptions, or the test case as a whole?
  • Need to identify the main object that is interacted with

5.11.3 Planned Work [8/21]

  • Change top mattress <2018-04-16 Mon 13:00>
  • Apply to the PhD position
  • Get reimbursed for ICST
  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [1/4]
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report [2/6]
    • work on background sections
    • add Java examples
    • insist on the distinction between what and why information
    • describe thoroughly the oracle problem
    • over fitting
    • ugly C#
  • Read papers
  • formal description of the NL description
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
  • SCOTCH might actually be useful
    • check other 'Not Relevant' papers
  • read code defenders papers
  • check Repairnator paper
    • example of software development bot
  • Tests as specifications, or something like that, a paper that gives formal definitions of tests
  • Use cases instead of evaluation
  • Check mutant assumption
    • it's mutation applied at the software design level
  • Read José Rojas's ICST paper

5.12 DONE Week 12 <2018-04-23 Mon>–<2018-04-29 Sun>

5.12.1 Things Done

  • Proofread Long's paper
  • Mattias' presentation

5.12.2 Blocking Points

  • Assertion count
    • Is the process: remove all assertions and generate all possible assertions?
    • if so then all assertions are counted as amplifications
  • Assertion log, can there be REMOVE+ADD instead of MODIFY?
    • there seems to be a lot of MODIFY
  • Should I focus solely on mutant descriptions, amplification descriptions, or the test case as a whole?
  • Need to identify the main object that is interacted with
  • Ugly to pass the AmplificationListener around

5.12.3 Planned Work [3/17]

  • Apply to the PhD position
  • Get reimbursed for ICST
  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [1/5]
    • use another counter to keep the pointer to AST nodes of amplifications
    • add categories of modifying amplifiers
      • MODIFY LITERALS
      • MODIFY INTERACTIONS
    • write test to make sure every amplifier includes the counter update
    • write test to make sure every amplifier logs the amplifications
    • identify amplification when writing the JSON report for the test case
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report [1/6]
  • formal description of the NL description
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
  • check other 'Not Relevant' papers
  • read code defenders papers
  • concurrent map in amplification listener
  • upgrade jacoco version
    • not my job, and PIT is probably also not supporting java10
  • unspecify tests [4/5]
    • use a Set to avoid failing tests due to the ordering of assertions (see the sketch after this list)
    • testBuildNewAssertWithComment(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testBuildNewAssert(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testNoInstrumentationOnGeneratedObject(fr.inria.diversify.dspot.assertGenerator.AssertGeneratorHelperTest): expected:<..."test_sd6__1");(..)
    • testWithLoop(fr.inria.diversify.dspot.amplifier.StatementAddTest): expected:<...compute(0);(..)
    • find others that could fail
    • problem of readability when it fails
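
A minimal sketch of the Set idea above (helper names are mine, not DSpot's): compare the generated test's statements as a set, so that assertion ordering alone cannot fail the check.

import java.util.HashSet;
import java.util.Set;
import static org.junit.Assert.assertEquals;

class StatementSetCheck {
    // naive split on ';' is enough for generated one-statement-per-line tests
    static Set<String> statements(String testBody) {
        Set<String> set = new HashSet<>();
        for (String s : testBody.split(";")) {
            if (!s.trim().isEmpty()) set.add(s.trim());
        }
        return set;
    }

    static void assertSameStatements(String expected, String actual) {
        assertEquals(statements(expected), statements(actual));
    }
}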

5.13 DONE Week 13 <2018-04-30 Mon>–<2018-05-06 Sun>

5.13.1 Things Done

  • DSpot on javapoet
project=.
targetModule=.
src=src/main/java/
testSrc=src/test/java
javaVersion=8
outputDirectory=dspot-out/
filter=com.squareup.javapoet.*
  • there are new amplified tests but only with new amplifications
  • DSpot on mustache.java (9.0) compiler
  • Listener only collects a-amplifications; trying something else, like a global counter (see the sketch below)
    • and it was very slow
  • First version of prmessagegen
python3 main.py -amplog ../dspot/javapoet/dspot-out/com.squareup.javapoet.TypeNameTest_amp_log.json -mutants ../dspot/javapoet/dspot-out/com.squareup.javapoet.TypeNameTest_mutants_killed.json
for filename in ../dspot/javapoet/dspot-out/*_amp_log.json; do python3 main.py -test "${filename: : -13}"; done
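
A minimal sketch of the global-counter idea mentioned above (hypothetical, not DSpot's actual code): a single atomic counter hands out an id per applied amplification, so amplifications can be recovered by id instead of by tracking AST nodes.

import java.util.concurrent.atomic.AtomicInteger;

final class AmplificationCounter {
    private static final AtomicInteger NEXT = new AtomicInteger();

    private AmplificationCounter() {}

    // called by each amplifier right after it applies a transformation
    static int nextId() {
        return NEXT.getAndIncrement();
    }
}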

5.13.2 Blocking Points

  • Assertion count
    • Is the process: remove all assertions and generate all possible assertions?
    • if so then all assertions are counted as amplifications
    • it's fine
  • Should I focus solely on mutants description, amplifications descriptions or test case as a whole?
    • do everything and compare
  • Need to identify the main object that is interacted with
  • Ugly to pass the AmplificationListener around
    • stand-alone tool

5.13.3 Planned Work [12/19]

  • Simple NL commit messages generator
  • Classification of mutators
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Study commit messages related to tests
  • More precise try/catch would actually be useful for slicing
  • Retrieve amplifications [3/4]
    • add categories of modifying amplifiers
      • MODIFY LITERALS
      • MODIFY INTERACTIONS
    • write test to make sure every amplifier includes the counter update
    • write test to make sure every amplifier logs the amplifications
    • identify amplification when writing the JSON report for the test case
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report [1/5]
  • formal description of the NL description
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
  • check other 'Not Relevant' papers
  • read code defenders papers
  • concurrent map in amplification listener
  • unspecify tests [4/5]
    • use a Set to avoid failing tests due to the ordering of assertions
    • testBuildNewAssertWithComment(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testBuildNewAssert(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testNoInstrumentationOnGeneratedObject(fr.inria.diversify.dspot.assertGenerator.AssertGeneratorHelperTest): expected:<..."test_sd6__1");(..)
    • testWithLoop(fr.inria.diversify.dspot.amplifier.StatementAddTest): expected:<...compute(0);(..)
    • find others that could fail
    • problem of readability when it fails
    • https://github.com/STAMP-project/dspot/pull/413
  • Referees reminder (deadline <2018-05-11 Fri>)
  • Answer Benoit's mail on Towards Automatically Generating Descriptive Names for Unit Tests
    • they, partly, analyse the text in the test's body
    • Action
    • Expected Outcome
    • Scenario Under Test
    • we want to explain amplifications
  • Meeting Benoit <2018-05-02 Wed 17:00>–<2018-05-02 Wed 18:00>
    • to present
      • the state of the art: what the papers you have read do, and what differentiates your approach from the others
      • the state of your tool: where it stands relative to DSpot, what it can do
      • your plans for an experimental validation of this tool
      • your plans until the end of the internship
    • print report
    • discuss SCAM (paper deadline <2018-06-15 Fri>)
      • I'll never have a contribution and a report by <2018-06-08 Fri>, so adding a paper on top of that…
    • a table for the related works could still be nice, in addition to the text
    • abstract myself from failing tests
    • hyperlink of the line of the mutant
    • have the generator independent from DSpot
      • serialize the list of amplifications
    • multiple version of generators
    • pick a text extension
    • have a discussion on the different versions
    • use projects from the evaluation in the paper, not dhell
  • mutants reported should only be the new ones (see the sketch after this list)
    • can remove, afterwards, redundant mutants from parent
  • remove discarded test from the amplification log
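
A minimal sketch of reporting only new mutants (identifiers hypothetical): a set difference between the mutants killed by the amplified test and those already killed by its parent.

import java.util.HashSet;
import java.util.Set;

class NewMutants {
    static Set<String> newlyKilled(Set<String> killedByAmplified,
                                   Set<String> killedByOriginal) {
        Set<String> fresh = new HashSet<>(killedByAmplified);
        fresh.removeAll(killedByOriginal); // drop mutants the parent already kills
        return fresh;
    }
}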

5.14 DONE Week 14 <2018-05-07 Mon>–<2018-05-13 Sun>

5.14.1 Things Done

  • New command line usage python3 main.py -p ../dspot/javapoet/dspot.properties -t com.squareup.javapoet.JavaFileTest
  • for link previews to work on github
    • can't be inside a md link
    • can't have /./
  • Using a distance metric for mutants to express how indirect the mutant's detection is?
    • that's an unclear and confusing metric
  • run DSpot on jsoup
    • very long
    • NPE on minimisation
  • run DSpot on twilio
    • way too big
  • running DSpot on dhell doesn't yield anything
  • Pierre Laperdrix talk on browser fingerprinting
  • run DSpot on scribejava-core, no i-amplified test
  • tried on jodatime but JUnit too old
  • socketio test suite not passing
  • javacpp test suite not passing
  • google-java-format seems to be way too long
  • nothing for fb-contrib
  • fess is too slow
  • run on xwiki-commons-core
  • didn't work on jabref
  • didn't work on webdrivermanager

5.14.2 Blocking Points

  • Need to identify the main object that is interacted with
  • What does it mean to have only a-amp, when the original tests are already a-amplified?
    • nah, must be a problem, I don't collect all a-amps

5.14.3 Planned Work [6/15]

  • Renew SL access card <2018-05-08 Tue>
  • Workshop <2018-05-08 Tue>
    • Room 4423
  • Study for IELTS
  • Withdraw UCL application
    • withdraw online application
    • people to apologise to
      • Justyna
      • Shin
      • ENS profs
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Retrieve amplifications [0/1]
    • add categories of modifying amplifiers
      • MODIFY LITERALS
      • MODIFY INTERACTIONS
  • Retrieve mutants
    • The issue on runtime info from PIT
  • Work on report [0/4]
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
  • unspecify tests [4/5]
    • use a Set to avoid failing tests due to the ordering of assertions
    • testBuildNewAssertWithComment(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testBuildNewAssert(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testNoInstrumentationOnGeneratedObject(fr.inria.diversify.dspot.assertGenerator.AssertGeneratorHelperTest): expected:<..."test_sd6__1");(..)
    • testWithLoop(fr.inria.diversify.dspot.amplifier.StatementAddTest): expected:<...compute(0);(..)
    • find others that could fail
    • problem of readability when it fails
    • https://github.com/STAMP-project/dspot/pull/413
  • improvements [4/10]
    • group assertions
    • Add links in report
    • "The new test can detect if toBuilder returns XXX instead of the regular value. The original test 'toto' could not detect this fault" where XXX is the value injected by the mutation
      • isn't really suitable for other kinds of mutants; even for return-related mutants, what might be interesting is that they change the state of the SUT without having a direct relation to the test case
    • also consider when mutation modifies the state and is detected later
    • Don't name mutators, only explain the transformation instance (i.e. mutator category?)
    • Group mutants that are on the same line
    • show the generated test with a diff
      • don't put a diff for assertions when there are too many (more than 10)
    • don't talk about 'new detectable bugs' but about 'assess more behavior than original' and/or 'reach more paths than original'
    • For example, "this new test assesses more behavior than the original: it can detect 5 changes in the source code that the original test could not detect:" and then show the changes (diffs)
    • handle try/catch
  • Read dspot.properties and read various info like the path
  • Measure the overhead of the amplification logging
    • javapoet
      • DSpot(master): 47m22s
      • DSpot(collect_amp): 45m52s 45m50s 52m3s
  • Make better names for generated tests
  • Don't add try/catch when @Test(expected = IllegalArgumentException.class)
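
To illustrate the last item, a minimal JUnit 4 sketch of the two equivalent oracle forms; when the annotation form is already present, the generator should not wrap the body in a redundant try/catch.

import org.junit.Assert;
import org.junit.Test;

public class ExpectedExceptionForms {
    // annotation form: JUnit itself checks that the exception is thrown
    // (NumberFormatException is a subclass of IllegalArgumentException)
    @Test(expected = IllegalArgumentException.class)
    public void annotationForm() {
        Integer.parseInt("not a number");
    }

    // try/catch form: the kind of oracle DSpot generates
    @Test
    public void tryCatchForm() {
        try {
            Integer.parseInt("not a number");
            Assert.fail("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            // the call failed as the oracle requires
        }
    }
}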

5.15 DONE Week 15 <2018-05-14 Mon>–<2018-05-20 Sun>

5.15.1 Things Done

  • Spoon talk
  • https://askubuntu.com/questions/623577/no-such-file-or-directory-when-trying-to-remove-a-file-but-the-file-exists
  • DSpot was not generating inputs… 😰🤬😰🤬😰🤬😰🤬
    • java -jar ~/internship_amplification/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar --path-to-properties dspot.properties --amplifiers AllLiteralAmplifiers:MethodAdd:MethodRemove:StatementAdd:ReplacementAmplifier --clean --verbose
  • run on javapoet
    • added jvmArgs=['-Xmx2048m','-Xms1024m','-XX:-UseGCOverheadLimit']
      • didn't work
    • bumped to -Xms16G -Xmx32G
    • removed
      • too messy
  • java.lang.OutOfMemoryError: GC overhead limit exceeded
    • removing weak keys (and concurrencyLevel at 4); see the map sketch after this list
      • didn't change anything with -XX:-UseGCOverheadLimit
      • was taking forever without -XX:-UseGCOverheadLimit, and couldn't kill it
    • raising concurrencyLevel to 16 (and keeping weak keys)
      • didn't change anything
    • raising concurrencyLevel to 16 without weak keys
      • didn't change anything
    • same errors without the ampTestToParentName map
      • had to put it back because of the NPE
    • removed MethodRemove and AllLiteralAmplifier
      • was taking forever and couldn't kill it
    • with only StatementAdd
      • java.lang.OutOfMemoryError: GC overhead limit exceeded after an hour, instead of 30 minutes
java -jar ~/internship_amplification/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar --test com.squareup.javapoet.ParameterSpecTest --cases equalsAndHashCode --path-to-properties dspot.properties --amplifiers StatementAdd --verbose
  • no result
  • xwiki-commons-crypto-cipher
java -jar ~/internship_amplification/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar --path-to-properties dspot.properties --amplifiers MethodAdd --test-criterion PitMutantScoreSelector --clean --verbose
  • reproduced the results of Massol 👍
  • 17min
  • java.lang.OutOfMemoryError: Java heap space with main DSpot(master), all amplifiers
  • java.lang.OutOfMemoryError: GC overhead limit exceeded with main
  • java.lang.OutOfMemoryError: Java heap space with main DSpot(master): with only StatementAdd after 1h21min, and also without MethodRemove and AllLiteralAmplifier
  • so much time gone with the wind 🤬
  • main DSpot(master)
java -jar ~/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar --test com.squareup.javapoet.AbstractTypesTest --cases getBasicTypeMirror --path-to-properties dspot.properties --amplifiers StatementAdd --clean --verbose
  • didn't yield anything, but didn't crash
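
For reference, a minimal sketch of the kind of map tuning tried above. I am assuming a Guava MapMaker-style builder here; the actual listener code may be structured differently, and the key/value types are placeholders.

import com.google.common.collect.MapMaker;
import java.util.concurrent.ConcurrentMap;

class ListenerMaps {
    // weak keys were removed in the runs above; concurrencyLevel was tried
    // at 4 and 16, with no visible effect on the OutOfMemoryErrors
    ConcurrentMap<String, String> ampTestToParentName =
        new MapMaker()
            .concurrencyLevel(16)
            .makeMap();
}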

5.15.2 Blocking Points

  • Need to identify the main object that is interacted with
  • null parentName

5.15.3 Planned Work [8/15]

  • IELTS Speaking <2018-05-18 Fri 10:00>
  • IELTS <2018-05-19 Sat 08:30>
  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Retrieve amplifications [0/1]
    • add categories of modifying amplifiers
      • MODIFY LITERALS
      • MODIFY INTERACTIONS
  • Retrieve mutants
  • Work on report [0/4]
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
  • unspecify tests [5/5]
    • use a Set to avoid failing tests due to the ordering of assertions
    • testBuildNewAssertWithComment(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testBuildNewAssert(fr.inria.diversify.dspot.assertGenerator.MethodsAssertGeneratorTest): expected:<....junit.Assert.assert[True(((fr.inria.sample.ClassWithBoolean)cl).getTrue());(..)
    • testNoInstrumentationOnGeneratedObject(fr.inria.diversify.dspot.assertGenerator.AssertGeneratorHelperTest): expected:<..."test_sd6__1");(..)
    • testWithLoop(fr.inria.diversify.dspot.amplifier.StatementAddTest): expected:<...compute(0);(..)
    • find others that could fail
    • problem of readability when it fails
    • https://github.com/STAMP-project/dspot/pull/413
  • improvements [3/6]
    • "The new test can detect if toBuilder returns XXX instead of the regular value. The original test 'toto' could not detect this fault" where XXX is the value injected by the mutation
      • isn't really suitable for other kinds of mutants, and even for return-related mutants as what might be interesting is that they change the state of the SUT, but don't have a direct relation with the test case
    • also consider when mutation modifies the state and is detected later
    • Don't name mutators, only explain the transformation instance (i.e. mutator category?)
    • don't talk about 'new detectable bugs' but about 'assess more behavior than original' and/or 'reach more paths than original'
    • For example, "this new test assesses more behavior as the original: it can detect 5 changes in the source code that the original test could not detect:" and then show the changes (diffs)
    • handle try/catch
  • Make better names for generated tests
  • Don't add try/catch when @Test(expected = IllegalArgumentException.class)
  • send email for the job at fb
  • differentiate assert, trycatch, and test annotations
  • add DSpot and PIT forks as submodules
  • reproduce DSpot on xwiki-commons
    • asked for the xwiki-commons version

5.16 DONE Week 16 <2018-05-21 Mon>–<2018-05-27 Sun>

5.16.1 Things Done

java -jar ~/internship_amplification/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar --path-to-properties dspot.properties --amplifiers MethodAdd:StatementAdd:NumberLiteralAmplifier:BooleanLiteralAmplifier --test-criterion PitMutantScoreSelector --test com.squareup.javapoet.TypeSpecTest --clean --verbose
  • didn't finish, got stuck
  • rerun with only MethodAdd -> GC overhead
  • rerun with only MethodAdd on UtilTest, no new tests
java -jar ~/internship_amplification/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar --path-to-properties dspot.properties --amplifiers MethodAdd:StatementAdd:NumberLiteralAmplifier:BooleanLiteralAmplifier --test-criterion PitMutantScoreSelector --test com.squareup.javapoet.NameAllocatorTest --clean --verbose
  • worked, in 33min

5.16.2 Blocking Points

  • Need to identify the main object that is interacted with
  • null parentName
    • not with IdentityHashMap
    • we'll keep that as we have no time to care about performance
  • total number of assertions not matching
    • sometimes total is higher (minimisation not caught?)
    • sometimes total is lower
  • no direct parent, for the ordering, so it is useless

5.16.3 Planned Work [12/23]

  • Integrate WALA to compute a slice per new mutant
    • Adding as dependency
    • Use it
  • Retrieve amplifications [1/1]
    • add categories of modifying amplifiers
      • MODIFY LITERALS
      • MODIFY INTERACTIONS
      • yeah, no, maybe in the future
  • Work on report [0/10]
    • add Java examples
    • describe thoroughly the oracle problem
    • overspecification
    • make a table for related works?
    • DSpot
      • the process of removing all assertions, then adding some, then input-amplification, then assertions
    • What have I done [0/3]
      • Conceptual contribution
        • variables created through a-amplification that aren't useful for an assertion but that raise exceptions
        • long names, so simple NL syntheses are nice
      • Implementation
        • lines
      • Aside stuff
        • bugs encountered and fixed
    • Problems encountered
      • ideally we would like to have info about whether a mutant modifies the state of the CUT or if it's something else
    • Evaluation
      • low overhead
      • it works (e.g. right total number of asserts)
    • Go through … to remember what I did [0/4]
      • previous journal entries
      • emails
      • commits
      • functions of pr_message_gen
    • Reduce the number of references
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
    • XWiki
      • get all infos necessary
      • reproduce experiment
      • ask for feedback
    • others
  • improvements [7/7]
    • "The new test can detect if toBuilder returns XXX instead of the regular value. The original test 'toto' could not detect this fault" where XXX is the value injected by the mutation
      • isn't really suitable for other kinds of mutants; even for return-related mutants, what might be interesting is that they change the state of the SUT without having a direct relation to the test case
      • yeah, no.
    • also consider when mutation modifies the state and is detected later
    • Don't name mutators, only explain the transformation instance (i.e. mutator category?)
      • PIT description
    • fix assignments with multiple lines
    • useless parent of i-amp
    • useless try/catch diff
    • use 'an' for 1
  • Fix names of generated tests and originals'
  • differentiate assert, trycatch?
    • do it in the message generator
  • Meeting Benoit <2018-05-21 Mon 16:00>
    • state of the tool
    • state of the report
    • what's planned for the coming month
    • getters, is, toString only
    • remove mutants killed by the original
    • distinguish improving coverage or new capacity
  • Group assertions for the variable they're testing (see the sketch after this list)
  • realise that only getters, is, and toString are used
  • realise that asserts always go through getFactory
  • remove redundant mutants
  • differentiate additional coverage to new behaviours
  • natural language description of assertions that target a variable
  • add folds for overwhelming details
  • Order tests, starting with the oldest first
  • Sometimes an assertion for a variable directly tests the variable (it can simply be a string), so no need to find a method call
  • Reformat assertions for a variable when there is only one assert for this variable
  • Reformat multiple identical assertions
  • New useless assert variables should be reported differently from inputs
  • Target grouping should be based on variable (not variable#method)
  • Add whole test snippet for development
  • Maybe remaining assertions should be treated like assertions that check a variable that was generated at the same time
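
A minimal sketch of grouping assertions by the variable they observe, per the items above (target grouping based on the variable, not variable#method). The regex is deliberately naive: it takes the first identifier followed by a dot after the assert's opening parenthesis, so casts and qualified names will confuse it.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AssertionGrouper {
    // matches e.g. assertEquals("x", name.toString()) or assertTrue(list.isEmpty())
    private static final Pattern TARGET =
        Pattern.compile("assert\\w+\\([^,)]*?,?\\s*(\\w+)\\.");

    static Map<String, List<String>> groupByVariable(List<String> assertions) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String a : assertions) {
            Matcher m = TARGET.matcher(a);
            // assertions that test a value directly go in a separate group
            String var = m.find() ? m.group(1) : "<direct>";
            groups.computeIfAbsent(var, k -> new ArrayList<>()).add(a);
        }
        return groups;
    }
}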

5.17 DONE Week 17 <2018-05-28 Mon>–<2018-06-03 Sun>

5.17.1 Things Done

  • AmplificationLog with strings as key to avoid losing parents
    • no wait, shouldn't change anything, because no weak keys

5.17.2 Blocking Points

  • total number of assertions not matching
    • sometimes total is higher (minimisation not caught?)
    • sometimes total is lower
    • not using DSpot's report number for now
  • no direct parent for the ordering, so it is useless; first thought to be wrong parenting, but still no direct parent
    • using all =String=s for the parenting map doesn't work either
    • ampTestToParentName was cleared before report
      • nope
    • the problem was in prgen?? I'm the worst
    • still no direct parent
    • no direct parent because only a-amp(?)
      • yes, so two cases
        1. only better oracle
        2. also better inputs
  • Why, sometimes, are there amplified tests with 3 iterations of amplifications and no logged parent, and sometimes 1 parent?
  • Speaking about different behaviour and different oracle defeats the goal of keeping things very simple
  • Amplifications log not kept for every test?

5.17.3 Planned Work [1/12]

  • Work on report [0/10]
    • add Java examples
    • describe thoroughly the oracle problem
    • overspecification
    • make a table for related works?
    • DSpot
      • the process of removing all assertions, then adding some, then input-amplification, then assertions (see the sketch after this list)
      • iterations
    • What have I done [0/3]
      • Conceptual contribution
        • variables created through a-amplification that aren't useful for an assertion but that raise exceptions
        • long names, so simple NL syntheses are nice
        • identifying killing assertions isn't so great because it's not necessarily direct
      • Implementation
        • lines
      • Aside stuff
        • bugs encountered and fixed
    • Problems encountered
      • ideally we would like to have info about whether a mutant modifies the state of the CUT or if it's something else
    • Evaluation
      • low overhead
      • it works (e.g. right total number of asserts)
    • Go through … to remember what I did [0/4]
      • previous journal entries
      • emails
      • commits
      • functions of pr_message_gen
    • Reduce the number of references
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
    • XWiki
      • ask for feedback
    • others
  • Fix names of generated tests and originals'
  • realise that only getters, is, and toString are used
  • realise that asserts always go through getFactory
  • remove redundant mutants
  • differentiate additional coverage to new behaviours
  • Sometimes an assertion for a variable directly tests the variable (it can simply be a string), so no need to find a method call
  • Reformat assertions for a variable when there is only one assert for this variable
  • Target grouping should be based on variable (not variable#method)
  • Maybe remaining assertions should be treated like assertions that check a variable that was generated at the same time
  • In the case of a useless (standalone) i-amp, we still need to recover it to explain the a-amps that follow
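
For the DSpot section of the report, a pseudocode-level sketch of the amplification loop as described above; every method here is a hypothetical stub, not DSpot's actual API.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

abstract class AmplificationLoopSketch {
    abstract String removeAssertions(String test);
    abstract List<String> generateAssertions(String test);
    abstract List<String> applyInputAmplifiers(List<String> tests);
    abstract List<String> select(List<String> candidates); // e.g. PIT mutant score

    List<String> amplify(String original, int iterations) {
        List<String> kept = new ArrayList<>();
        String stripped = removeAssertions(original);
        // pure a-amplification first: regenerate assertions on the stripped test
        kept.addAll(select(generateAssertions(stripped)));
        List<String> pool = Collections.singletonList(stripped);
        for (int i = 0; i < iterations; i++) {
            // i-amplification, then a-amplification on each new input
            pool = applyInputAmplifiers(pool);
            for (String test : pool) {
                kept.addAll(select(generateAssertions(test)));
            }
        }
        return kept;
    }
}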

5.18 DONE Week 18 <2018-06-04 Mon>–<2018-06-10 Sun>

5.18.1 Things Done

  • run on xwiki cipher [xwiki-commons-10.0] [VisualVM 1.4.1]
    • --amplifiers MethodAdd --test-criterion PitMutantScoreSelector --clean --verbose
    • fork: 20m27s, 21m40s; 884 MB max used heap
    • main master: 19m52s, 21m56s; 834 MB max used heap
  • run on javapoet
    • --amplifiers MethodAdd:StatementAdd:NumberLiteralAmplifier:BooleanLiteralAmplifier --test-criterion PitMutantScoreSelector --test com.squareup.javapoet.NameAllocatorTest --clean --verbose
    • fork:
    • main master:
    • Failed to execute goal org.pitest:pitest-maven:1.3.0:mutationCoverage (default-cli) on project javapoet: Execution default-cli of goal org.pitest:pitest-maven:1.3.0:mutationCoverage failed: All tests did not pass without mutation when calculating line coverage. Mutation testing requires a green suite.

5.18.2 Blocking Points

  • total number of assertions not matching
    • sometimes total is higher (minimisation not caught?)
    • sometimes total is lower
    • not using DSpot's report number for now
  • Why, sometimes, are there amplified tests with 3 iterations of amplifications and no logged parent, and sometimes 1 parent?
    • not true
  • Speaking about different behaviour and different oracle defeats the goal of keeping things very simple
  • Amplifications log not kept for every test?
    • it was a problem with using the name of the class when ampliedTest == null

5.18.3 Planned Work [4/14]

  • Report <2018-06-08 Fri 12:00>
  • Renew SL access card <2018-06-07 Thu>
  • Work on report [8/11]
    • add Java examples
    • describe thoroughly the oracle problem
    • overspecification
    • make a table for related works?
    • DSpot
      • the process of removing all assertions, then adding some, then input-amplification, then assertions
      • iterations
      • minimisation
      • heavy memory usage
    • What have I done [2/3]
      • Conceptual contribution
        • variables created through a-amplification that aren't useful for an assertion but that raise exceptions
        • long names, so simple NL syntheses are nice
        • identifying killing assertions isn't so great because it's not necessarily direct
        • assume that the original is known and understood by the developer?
        • better oracle and/or new behaviour
      • Implementation
        • lines
        • aimed at GitHub; could be used for GitLab, but not for BitBucket (because of the HTML of the fold blocks)
      • Aside stuff
        • bugs encountered and fixed
    • Problems encountered
      • ideally we would like to have info about whether a mutant modifies the state of the CUT or if it's something else
      • balance between complex information and information simplified to a useless point
      • speaking of mutants by talking about changes
      • very hard to log everything that happens, especially with minimisation, where the border between original and amplification thins
    • Evaluation
      • low overhead
      • it works (e.g. right total number of asserts)
    • Go through … to remember what I did [0/4]
      • previous journal entries
      • emails
      • commits
      • functions of pr_message_gen
    • Reduce the number of references
    • Screenshots
  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
    • XWiki
      • ask for feedback
    • others
  • realise that only getters, is, and toString are used
  • realise that asserts always go through getFactory
  • remove redundant mutants
  • differentiate additional coverage to new behaviours
  • Sometimes an assertion for a variable directly tests the variable (it can simply be a string), so no need to find a method call
  • Reformat assertions for a variable when there is only one assert for this variable
  • Target grouping should be based on variable (not variable#method)
  • Maybe remaining assertions should be treated like assertions that check a variable that was generated at the same time
  • In the case of a useless (alone) i-amp, still need to recover these to explain the a-amp that follow
  • Update AmplificationLog after minimization

5.19 DONE Week 19 <2018-06-11 Mon>–<2018-06-17 Sun>

5.19.1 Things Done

5.19.2 Blocking Points

  • total number of assertions not matching
    • sometimes total is higher (minimisation not caught?)
    • sometimes total is lower
    • not using DSpot's report number for now
  • Speaking about different behaviour and different oracle defeats the goal of keeping things very simple

5.19.3 Planned Work [2/12]

  • ask for opinions (e.g. Simon U, XWiki, SAT4J)
    • XWiki
      • ask for feedback
    • others
  • realise that only getters, is, and toString are used
  • realise that asserts always go through getFactory
  • remove redundant mutants
  • differentiate additional coverage to new behaviours
  • Sometimes an assertion for a variable directly tests the variable (it can simply be a string), so no need to find a method call
  • Reformat assertions for a variable when there is only one assert for this variable
  • Target grouping should be based on variable (not variable#method)
  • Maybe remaining assertions should be treated like assertions that check a variable that was generated at the same time
  • Work on slides [0/1]
    • add references
  • Return keys to Kista
  • Meeting Benoit <2018-06-13 Wed 14:00>
    • Kista keys?

5.20 DONE Week 20 <2018-06-18 Mon>–<2018-06-24 Sun>

5.20.1 Things Done

  • Threw away the printed papers. Hard to see all those promising annotations go.

5.20.2 Blocking Points

5.20.3 Planned Work [3/4]

  • Defense Rehearsal @ ENS <2018-06-22 Fri>
  • Hand in grant form <2018-06-21 Thu>
  • Return keys
    • office
    • apartment
  • Work on slides [0/2]
    • add references
    • a bit of state of the art

5.21 DONE Week 21 <2018-06-25 Mon>–<2018-06-27 Wed>

5.21.1 Things Done

5.21.2 Blocking Points

5.21.3 Planned Work [2/2]

  • Defense <2018-06-27 Wed 14:00> @ Jersey Room
  • Work on slides [2/2]
    • add references
    • a bit of state of the art

6 Conclusion

Date: 2018-02-07 Wed 00:00

Author: Simon Bihel

Created: 2018-06-27 Wed 14:45
