
Detect and reject equivalent mutants #43

Open
davidkelk opened this issue May 24, 2012 · 6 comments

Comments

@davidkelk
Member

Currently, if two different members of the population arrive at the same program structure after mutation, each is evaluated. Time would be saved if, after the first evaluation, we recorded a hash of the mutated program along with the result of the evaluation. Every new mutation could then be checked against the hash list: if it is already there, there is no need to test it again; reject the mutation and try again.
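
Roughly what I have in mind (just a sketch; the hashing, cache, and `run_contest` hook below are illustrative, not actual ARC code):

    import hashlib

    # Cache of evaluation results, keyed by a hash of the mutated program's source.
    evaluated = {}

    def evaluate_mutant(source_text, run_contest):
        """Evaluate a mutant, reusing the cached result if an identical
        program has already been evaluated in this run."""
        key = hashlib.sha1(source_text.encode("utf-8")).hexdigest()
        if key in evaluated:
            # Same program structure as an earlier mutant: skip the ConTest runs.
            return evaluated[key]
        result = run_contest(source_text)  # the expensive ConTest evaluation
        evaluated[key] = result
        return result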

@davidkelk
Member Author

Perhaps it would be better to detect the repeat (and thus save on the ConTest evaluations) but keep the member, as it will probably mutate into something different from the previous one. (Maintain diversity.)

@kevinjalbert
Collaborator

I think that sounds right. It might be the case that the solution is actually one mutation away from the already-encountered program state, and that the first time around it simply went down another path.

I think we can just skip the ConTest evaluations like you mentioned.

@davidkelk
Member Author

Checked this in on master (8e61f91).

Let's say gen 1 mem 3 is the same as gen 1 mem 1. Is there anything that needs to be copied from 1-1 to 1-3? I'm thinking about operator weightings and so on. Currently this patch doesn't do anything like that; it simply skips the ConTest runs.

Also, it is only for the functional phase at the moment. Is there a reason why it can't be added to the non-functional phase as well?

@davidkelk
Member Author

I went with the maintaining diversity approach.

@kevinjalbert
Collaborator

I think everything from 1-1 should go to 1-3 (excluding the past states, if any). We are just acknowledging that this individual state has already occurred, so we don't need to re-evaluate it. We just take the already-evaluated state from 1-1 and move it to 1-3 so we can evolve from that. This can influence the operator weightings, since duplicates will now contribute to them.
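
Something along these lines is what I'm picturing (just a sketch; the field names are guesses, not the actual ARC member structure):

    import copy

    def adopt_evaluated_state(original, duplicate):
        # Copy the already-evaluated state from the earlier, identical member
        # (e.g. 1-1) onto the duplicate (e.g. 1-3) so it can keep evolving
        # without another ConTest evaluation. Field names are illustrative.
        duplicate.evaluation = copy.deepcopy(original.evaluation)
        duplicate.operator_weights = copy.deepcopy(original.operator_weights)
        # Deliberately leave the duplicate's own past-state history alone so
        # each member keeps its own lineage and diversity is preserved.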

@davidkelk
Member Author

A short analysis of the effects of the ARC optimizations:

Super-TLDR Summary:

  • Generation per generation "optimized"-ARC (OP-ARC) is faster than "old"-ARC
  • Deadlocks are now occurring in OP-ARC, hurting performance
  • OP-ARC is taking on average one generation longer to find solutions, so ON AVERAGE
    IS SLOWER than old-ARC

TLDR Summary:

  • Analyzed 5 "old" ARC Lottery runs (Table 4 of the paper) and 5 "new" optimized ARC runs
  • Generation per generation optimized ARC (OP-ARC) is about 15% faster
  • OP-ARC is detecting deadlocks now, which is hurting performance by 14%: On avg. 11 deadlocks
    per run, adding on avg. 16.5 mins per run
    • old-ARC reported no deadlocks, so I'm not sure if I have fixed or introduced a bug
  • After correcting for deadlocks:
    • When old-ARC and OP-ARC find the solution at the same member and generation, OP-ARC
      was about 20% faster
    • On average OP-ARC is taking one generation longer to find a fix.
    • Even though generation per generation OP-ARC is faster, as it takes an extra generation,
      ON AVERAGE OP-ARC is SLOWER than old-ARC

Longer Description of Analysis:

  1. From the logs, determine the average time to find a fix for Lottery:

    Old ARC: 1h 55m
    OP-ARC: 2h 23m (24% slower)

  2. Determine the average generation the fix was found for Lottery:

    Old ARC: 3, 4, 2, 1, 2 (Avg = 2.4)
    OP-ARC: 2, 6, 3, 2, 3 (Avg = 3.2, 33% slower)

  3. Count the number of deadlocks occurring during each run:

    Old ARC: 0, 0, 0, 0, 0 Avg. = 0
    OP-ARC: 0, 52, 5, 0, 0 Avg. = 11.4

Each deadlock took 94 seconds longer than a data race to evaluate.

  4. Factor out the average 11.4 deadlocks from the time calculation in 1:

    Old ARC: 1h 55m
    OP-ARC: 2h 6m (10% slower)

Deadlocks hurt performance by 14% (24% - 10%).
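
For reference, the correction in 4 is just this arithmetic (a quick sanity check, not ARC code; the figures come from 1 and 3 above):

    avg_deadlocks = 11.4          # average deadlocks per OP-ARC run (from 3)
    extra_sec_per_deadlock = 94   # a deadlock takes ~94 s longer than a data race
    overhead_min = avg_deadlocks * extra_sec_per_deadlock / 60.0  # ~17.9 min

    op_arc_min = 2 * 60 + 23                   # 2h 23m from 1
    corrected_min = op_arc_min - overhead_min  # ~125 min, close to the 2h 6m above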

  5. Using the revised numbers from 4 and the average number of generations to find a fix
    from 2, determine how Old ARC and OP-ARC compare generation per generation (with no
    deadlocks):

    Old ARC: 1h 55m / 2.4 = 48m
    OP-ARC: 2h 6m / 3.2 = 39m (23% faster)

  6. Compare an old-ARC run (run 1) and an OP-ARC run (run 3), where each found the
    solution at generation 3, member 17:

6a. Time to evaluate generation 1: (No deadlocks)

Old ARC: 50m
OP-ARC: 42m (16% faster)

6b. Time to evaluate generations 1 & 2: (No deadlocks)

Old ARC: 1h 29m
OP-ARC: 1h 13m (18% faster)

6c. Time to find solution at generation 3, member 17 for both:
(5 deadlocks occurred in gen 3 for OP-ARC.)

Old ARC: 2h 19m
OP-ARC: 2h 3m (12% faster)
OP-ARC-no-deadlocks: 1h 55m (21% faster)

With deadlocks and generations factored out, generation per generation OP-ARC does run
faster than Old ARC.
