A/B testing usability [determine test group size]

There are many A/B tests around, but they have one thing in common: they're meant to improve user conversion or increase sales. My goal is not to improve conversions or increase sales, but to verify why version A or B is better.

So I'm using an A/B test on an accounting web application to verify whether, and why, a newly made design is better than the other. I understand that an A/B test isn't best practice for this purpose, but I'm trying to achieve it with an A/B test anyway.

So, a few questions:

1.) Is there a way to measure whether version A or B is better based on usability?

Example: How can you measure (in the context of usability) whether a blue button performs better than a red button?

2.) Is there a way to calculate the size of an A/B test group with regard to usability (without using conversion rates)?

Example: The web application has 500 daily visitors. How do you determine how big the test group should be?
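
To make question 2 concrete, here is a minimal sketch of one way the calculation could look. It is not a confirmed method. It assumes the usability metric is a continuous measurement such as task completion time, and it uses the standard normal-approximation formula for comparing two means. The function names, `sigma` (expected standard deviation of the metric), `delta` (smallest difference worth detecting) and the example numbers are all assumptions for illustration, not measurements from the application.

```python
import math
from statistics import NormalDist


def sample_size_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Users needed per variant to detect a mean difference of `delta`.

    Normal-approximation formula for a two-sided, two-sample comparison
    of means: n = 2 * ((z_(1-alpha/2) + z_power) * sigma / delta)^2.
    `sigma` and `delta` must be in the same unit (e.g. seconds).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)


def days_needed(n_per_group, daily_visitors, groups=2):
    """Days of traffic required if every visitor is enrolled in the test."""
    return math.ceil(groups * n_per_group / daily_visitors)


# Hypothetical inputs: task time varies with a standard deviation of 20 s,
# and a 5 s difference between designs is the smallest one worth detecting.
n = sample_size_per_group(sigma=20, delta=5)
print(n, days_needed(n, daily_visitors=500))
```

With these assumed inputs the sketch gives 252 users per version, which is roughly 2 days of traffic at 500 daily visitors if every visitor takes part. Halving `delta` roughly quadruples the required group size, so the result depends heavily on how small a difference needs to be detected.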