Merge branch 'master' into unify_drivers

Conflicts:
	cgminer.c
Luke Dashjr 13 years ago
parent
commit
b8778839e9
24 changed files with 3422 additions and 1656 deletions
  1. AUTHORS (+1 -1)
  2. Makefile.am (+1 -1)
  3. NEWS (+126 -0)
  4. README (+62 -22)
  5. adl.c (+79 -24)
  6. adl.h (+2 -0)
  7. api.c (+671 -37)
  8. cgminer.c (+216 -44)
  9. configure.ac (+34 -13)
  10. diablo120328.cl (+10 -1)
  11. diakgcn120223.cl (+11 -61)
  12. driver-bitforce.c (+10 -2)
  13. driver-cpu.c (+1 -1)
  14. driver-icarus.c (+10 -3)
  15. driver-opencl.c (+48 -17)
  16. logging.c (+23 -20)
  17. miner.h (+37 -1)
  18. miner.php (+463 -92)
  19. ocl.c (+26 -23)
  20. ocl.h (+1 -0)
  21. poclbm120222.cl (+0 -1288)
  22. poclbm120327.cl (+1353 -0)
  23. util.c (+13 -5)
  24. windows-build.txt (+224 -0)

+ 1 - 1
AUTHORS

@@ -1,4 +1,4 @@
 Original CPU mining software: Jeff Garzik <jgarzik@pobox.com>
 GPU mining and rewrite: Con Kolivas <kernel@kolivas.org> 15qSxP1SQcUX3o4nhkfdbgyoWEFMomJ4rZ
 BitFORCE FPGA mining and refactor: Luke Dashjr <luke-jr+cgminer@utopios.org> 1NbRmS6a4dniwHHoSS9v3tEYUpP1Z5VVdL
-API+: Andrew Smith <kanoi@kano-kun.net> 1Jjk2LmktEQKnv8r2cZ9MvLiZwZ9gxabKm
+API+: Andrew Smith <kanoi2@kano-kun.net> 1Jjk2LmktEQKnv8r2cZ9MvLiZwZ9gxabKm

+ 1 - 1
Makefile.am

@@ -9,7 +9,7 @@ endif
 
 EXTRA_DIST	= example.conf m4/gnulib-cache.m4 linux-usb-cgminer \
 		  ADL_SDK/readme.txt api-example.php miner.php	\
-		  API.class API.java api-example.c
+		  API.class API.java api-example.c windows-build.txt
 
 SUBDIRS		= lib compat ccan
 

+ 126 - 0
NEWS

@@ -1,3 +1,129 @@
+Version 2.3.3 - April 15, 2012
+
+- Don't even display that cpumining is disabled on ./configure to discourage
+people from enabling it.
+- Do a complete cgminer restart if the ATI Display Library fails, as it does on
+windows after running for some time, when fanspeed reporting fails.
+- Cache the initial arguments passed to cgminer and implement an attempted
+restart option from the settings menu.
+- Disable per-device status lines when there are more than 8 devices since
+screen output will be corrupted, enumerating them to the log output instead at
+startup.
+- Reuse Vals[] array more than W[] till they're re-initialised on the second
+sha256 cycle in poclbm kernel.
+- Minor variable alignment in poclbm kernel.
+- Make sure to disable devices with any status not being DEV_ENABLED to ensure
+that thermal cutoff code works as it was setting the status to DEV_RECOVER.
+- Re-initialising ADL simply made the driver fail since it is corruption over
+time within the windows driver that's responsible. Revert "Attempt to
+re-initialise ADL should a device that previously reported fanspeed stops
+reporting it."
+- Microoptimise poclbm kernel by ordering Val variables according to usage
+frequency.
+
+
+Version 2.3.2 - March 31, 2012
+
+- Damping small changes in hashrate so dramatically has the tendency to always
+make the hashrate underread so go back to gentle damping instead.
+- Revert the crossover of variables from Vals to W in poclbm kernel now that
+Vals are the first declared variables so they're used more frequently.
+- Vals variables appearing first in the array in poclbm is faster.
+- Change the preferred vector width to 1 for Tahiti only, not all poclbm
+kernels.
+- Use a time constant 0.63 for when large changes in hashrate are detected to
+damp change in case the large change is an aliasing artefact instead of a real
+chang
+- Only increment stale counter if the detected stales are discarded.
+- Attempt to re-initialise ADL should a device that previously reported fanspeed
+stops reporting it.
+- Move the ADL setup and clearing to separate functions and provide a reinit_adl
+function to be used when adl fails while running.
+- Use slightly more damping on the decay time function in the never-ending quest
+to smooth off the hashmeter.
+- Set the starting fanspeed to a safe and fairly neutral 50% when autofan is
+enabled.
+- Provide locking around updates of cgpu hashrates as well to prevent multiple
+threads accessing data fields on the same device.
+- Display the beginning of the new block in verbose mode in the logs.
+- Reinstate old diablo kernel variable ordering from 120222, adding only goffset
+and vector size hint. The massive variable ordering change only helped one SDK
+on
+- Change the version number on the correct kernels.
+- api.c devicecode/osinfo incorrectly swapped for json
+- Add extensive instructions on how to make a native windows build.
+- Update version numbers of poclbm and diablo kernels as their APIs have also
+changed.
+- Use global offset parameter to diablo and poclbm kernel ONLY for 1 vector
+kernels.
+- Use poclbm preferentially on Tahiti now regardless of SDK.
+- Remove unused constant passed to poclbm.
+- Clean up use of macros in poclbm and use bitselect everywhere possible.
+- Add vector type hint to diablo kernel.
+- Add worksize and vector attribute hints to the poclbm kernel.
+- Spaces for non-aligned variables in poclbm.
+- More tidying of poclbm.
+- Swap Vals and W variables where they can overlap in poclbm.
+- More tidying of poclbm.
+- Tidy up first half of poclbm.
+- Clean up use of any() by diablo and poclbm kernels.
+- Minor variable symmetry changes in poclbm.
+- Put additions on separate lines for consistency in poclbm.
+- Consolidate last use of W11 into Vals4 in poclbm.
+- Change email due to SPAM
+- api.c miner.php add a '*' to the front of all notify counters - simplifies
+future support of new counters
+- miner.php add display 'notify' command
+- Small change to help arch's without processor affinity
+- Fix bitforce compile error
+- api.c notify should report disabled devices also - of course
+- API returns the simple device history with the 'notify' command
+- code changes for supporting a simple device history
+- api.c Report an OS string in config to help with device issues
+- api.c fix Log Interval - integer in JSON
+- api.c config 'Device Code' to show list of compiled devices + README
+- api.c increase buffer size close to current code allowable limit
+- removed 8-component vector support from kernel, as this is not supported in
+CGMINER anyway
+- forgot to update kernel modification date, fixed ;)
+- reordered an addition in the kernel, which results in less instructions used
+in the GPU ISA code for GCN
+- miner.php: option for readonly or check privileged access
+- Ignore reduntant-with-build options --disable-gpu, --no-adl, and --no-restart
+- miner.php: ereg_replace is DEPRECATED so use preg_replace instead
+- Make curses TUI support optional at compile-time.
+- Bugfix: AC_ARG_WITH provides withval instead of enableval
+- miner.php split devs output for different devices
+- api.c: correct error messages
+- icarus.c modify (regular) timeout warning to only be debug
+- icarus.c set the windows TODO timeout
+- Allow specifying a specific driver for --scan-serial
+- optimized nonce-check and output code for -v 2 and -v 4
+- Bugfix: Check for libudev header (not just library) in configure, and document
+optional dependency
+- Add API support for Icarus and Bitforce
+- Next API version is 1.4 (1.3 is current)
+- README/api.c add "When" the request was processed to STATUS
+- Bugfix: ZLX to read BitFORCE temp, not ZKX -.-
+- Use libudev to autodetect BitFORCE GPUs, if available
+- Use the return value of fan_autotune to set fan_optimal instead of passing it
+as a pointer.
+- Pass the lasttemp from the device we're using to adjust fanspeed in twin
+devices.
+- fix the name to 3 chars, fix the multi-icarus support
+- Bugfix: "-S auto" is the default if no -S is specified, and there is no such
+delay in using it
+- README add information missing from --scan-serial
+- Update README RPC API Version comment
+- Bugfix: Allow enabling CPU even without OpenCL support
+- Change failed-to-mine number of requested shares messge to avoid segfault on
+recursive calling of quit().
+- Get rid of extra char which is just truncated in poclbm kernel.
+- only small code formating changes
+- removed vec_step() as this could lead to errors on older SDKs
+- unified code for generating nonce in kernel and moved addition of base to the
+end -> faster
+
 Version 2.3.1 - February 24, 2012
 
 - Revert input and output code on diakgcn and phatk kernels to old style which

+ 62 - 22
README

@@ -44,6 +44,8 @@ Dependencies:
 	(This sdk is mandatory for GPU mining)
 	AMD ADL SDK		http://developer.amd.com/sdks/ADLSDK
 	(This sdk is mandatory for ATI GPU monitoring & clocking)
+	libudev headers
+	(This is only required for FPGA auto-detection)
 
 CGMiner specific configuration options:
 	--enable-cpumining      Build with cpu mining support(default disabled)
@@ -101,7 +103,7 @@ Basic WIN32 build instructions (LIKELY OUTDATED INFO. requires mingw32):
 	make
 	./mknsis.sh
 	
-Native WIN32 build instructions (on mingw32, on windows):
+Native WIN32 build instructions (outdated, see windows-build.txt)
 	Install the Microsoft platform SDK
 	Install AMD APP sdk, (if you want GPU mining)
 	Install AMD ADL sdk, (if you want GPU monitoring)
@@ -594,7 +596,16 @@ An example request in both formats to set GPU 0 fan to 80%:
 The format of each reply (unless stated otherwise) is a STATUS section
 followed by an optional detail section
 
-For API version 1.4:
+From API verion 1.7 onwards, reply strings in JSON and Text have the
+necessary escaping as required to avoid ambiguity - they didn't before 1.7
+For JSON the 2 characters '"' and '\' are escaped with a '\' before them
+For Text the 4 characters '|' ',' '=' and '\' are escaped the same way
+
+Only user entered information will contain characters that require being
+escaped, such as Pool URL, User and Password or the Config save filename,
+when they are returned in messages or as their values by the API
+
+For API version 1.4 and later:
 
 The STATUS section is:
 
@@ -620,7 +631,7 @@ The STATUS section is:
    This defaults to the cgminer version but is the value of --api-description
    if it was specified at runtime.
 
-For API version 1.4:
+For API version 1.7:
 
 The list of requests - a (*) means it requires privileged access - and replies are:
 
@@ -631,12 +642,14 @@ The list of requests - a (*) means it requires privileged access - and replies a
 
  config        CONFIG         Some miner configuration information:
                               GPU Count=N, <- the number of GPUs
+                              PGA Count=N, <- the number of PGAs
                               CPU Count=N, <- the number of CPUs
                               Pool Count=N, <- the number of Pools
                               ADL=X, <- Y or N if ADL is compiled in the code
                               ADL in use=X, <- Y or N if any GPU has ADL
                               Strategy=Name, <- the current pool strategy
-                              Log Interval=N| <- log interval (--log N)
+                              Log Interval=N, <- log interval (--log N)
+                              Device Code=GPU ICA | <- spaced list of compiled devices
 
  summary       SUMMARY        The status summary of the miner
                               e.g. Elapsed=NNN,Found Blocks=N,Getworks=N,...|
@@ -644,16 +657,22 @@ The list of requests - a (*) means it requires privileged access - and replies a
  pools         POOLS          The status of each pool
                               e.g. Pool=0,URL=http://pool.com:6311,Status=Alive,...|
 
- devs          DEVS           Each available CPU and GPU with their details
+ devs          DEVS           Each available GPU, PGA and CPU with their details
                               e.g. GPU=0,Accepted=NN,MHS av=NNN,...,Intensity=D|
                               Last Share Time=NNN, <- standand long time in seconds
                                (or 0 if none) of last accepted share
                               Last Share Pool=N, <- pool number (or -1 if none)
+                              Will not report PGAs if PGA mining is disabled
                               Will not report CPUs if CPU mining is disabled
 
  gpu|N         GPU            The details of a single GPU number N in the same
                               format and details as for DEVS
 
+ pga|N         PGA            The details of a single PGA number N in the same
+                              format and details as for DEVS
+                              This is only available if PGA mining is enabled
+                              Use 'pgacount' or 'config' first to see if there are any
+
  cpu|N         CPU            The details of a single CPU number N in the same
                               format and details as for DEVS
                               This is only available if CPU mining is enabled
@@ -661,6 +680,9 @@ The list of requests - a (*) means it requires privileged access - and replies a
 
  gpucount      GPUS           Count=N| <- the number of GPUs
 
+ pgacount      PGAS           Count=N| <- the number of PGAs
+                              Always returns 0 if PGA mining is disabled
+
  cpucount      CPUS           Count=N| <- the number of CPUs
                               Always returns 0 if CPU mining is disabled
 
@@ -687,6 +709,12 @@ The list of requests - a (*) means it requires privileged access - and replies a
                               stating the results of disabling pool N
                               The Msg includes the pool URL
 
+ removepool|N (*)
+               none           There is no reply section just the STATUS section
+                              stating the results of removing pool N
+                              The Msg includes the pool URL
+                              N.B. all details for the pool will be lost
+
  gpuenable|N (*)
                none           There is no reply section just the STATUS section
                               stating the results of the enable request
@@ -712,7 +740,7 @@ The list of requests - a (*) means it requires privileged access - and replies a
                               stating the results of setting GPU N clock to V MHz
 
  gpufan|N,V (*)
-                none           There is no reply section just the STATUS section
+               none           There is no reply section just the STATUS section
                               stating the results of setting GPU N fan speed to V%
 
  gpuvddc|N,V (*)
@@ -727,14 +755,28 @@ The list of requests - a (*) means it requires privileged access - and replies a
  quit (*)      none           There is no status section but just a single "BYE|"
                               reply before cgminer quits
 
+ notify        NOTIFY         The last status and history count of each devices problem
+                              e.g. NOTIFY=0,Name=GPU,ID=0,Last Well=1332432290,...|
+
  privileged (*)
                none           There is no reply section just the STATUS section
                               stating an error if you do not have privileged access
                               to the API and success if you do have privilege
                               The command doesn't change anything in cgminer
 
-When you enable, disable or restart a GPU, you will also get Thread messages in
-the cgminer status window
+ pgaenable|N (*)
+               none           There is no reply section just the STATUS section
+                              stating the results of the enable request
+                              You cannot enable a PGA if it's status is not WELL
+                              This is only available if PGA mining is enabled
+
+ pgadisable|N (*)
+               none           There is no reply section just the STATUS section
+                              stating the results of the disable request
+                              This is only available if PGA mining is enabled
+
+When you enable, disable or restart a GPU or PGA, you will also get Thread messages
+in the cgminer status window
 
 When you switch to a different pool to the current one, you will get a
 'Switching to URL' message in the cgminer status windows
@@ -767,9 +809,8 @@ api-example.c - a 'C' program to access the API (with source code)
 
 miner.php - an example web page to access the API
  This includes buttons and inputs to attempt access to the privileged commands
- You must modify the 2 lines near the top to change where it looks for cgminer
-  $miner = '127.0.0.1'; # hostname or IP address
-  $port = 4028;
+ Read the top of the file (miner.php) for details of how to tune the display
+ and also to use the option to display a multi-rig summary
 
 ---
 
@@ -846,21 +887,14 @@ any further.
 
 Q: Can you change the autofan/autogpu to change speeds in a different manner?
 A: The defaults are sane and safe. I'm not interested in changing them
-further. The starting fan speed is set to 85% in auto-fan mode as a safety
-precaution, but if a specific fan speed has been set, it will use that first
-before adjusting automatically.
-
-Q: The fanspeed starts at 85% with --auto-fan. Can I set it lower?
-A: The initial fanspeed will always start at 85% unless you choose your own
-value with --gpu-fan. In this case it will use the value you give it with
---gpu-fan as the first fanspeed, but it will also use this as the maximum fan
-speed unless overheat is detected.
+further. The starting fan speed is set to 50% in auto-fan mode as a safety
+precaution.
 
 Q: Why is my efficiency above/below 100%?
 A: Efficiency simply means how many shares you return for the amount of work
 you request. It does not correlate with efficient use of your hardware, and is
 a measure of a combination of hardware speed, block luck, pool design and other
-factors.
+factors
 
 Q: What are the best parameters to pass for X pool/hardware/device.
 A: Virtually always, the DEFAULT parameters give the best results. Most user
@@ -885,7 +919,7 @@ this time.
 
 Q: Which ATI SDK is the best for cgminer?
 A: At the moment, versions 2.4 and 2.5 work the best. If you are forced to use
-the 2.6 SDK, -v 1 might help, along with not decreasing your memory clock speed.
+the 2.6 SDK.
 
 Q: I have multiple SDKs installed, can I choose which one it uses?
 A: Run cgminer with the -n option and it will list all the platforms currently
@@ -921,6 +955,12 @@ it fail when php is installed properly but I only get errors about Sockets not
 working in the logs?
 A: http://us.php.net/manual/en/sockets.installation.php
 
+Q: What is a PGA?
+A: At the moment, cgminer supports 2 FPGA's: Icarus and BitForce.
+They are Field-Programmable Gate Arrays that have been programmed to do Bitcoin
+mining. Since the acronym needs to be only 3 characters, the "Field-" part has
+been skipped.
+
 ---
 
 This code is provided entirely free of charge by the programmer in his spare
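
Note on the README hunk above: the Text-format escaping rule it documents for API 1.7 (the four characters '|' ',' '=' and '\' are each preceded by a '\') can be sketched as below. This is only an illustration of the rule as the README states it. The function name escape_text_field and its truncation behaviour are assumptions made for the example, and it is not the code in api.c, whose diff is suppressed in this view.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from api.c): copy `in` to `out`,
 * putting a backslash before each of the four characters that the
 * README lists for the Text format: '|' ',' '=' and '\'.
 * Output is always NUL-terminated when outsz > 0; input that does not
 * fit is silently truncated at a character boundary.
 * Returns the number of bytes written, not counting the NUL.
 */
size_t escape_text_field(const char *in, char *out, size_t outsz)
{
	size_t n = 0;

	assert(in != NULL && out != NULL);
	if (outsz == 0)
		return 0;

	for (; *in; in++) {
		int special = (strchr("|,=\\", *in) != NULL);
		size_t need = special ? 2 : 1;

		/* Keep one byte free for the terminating NUL. */
		if (n + need >= outsz)
			break;
		if (special)
			out[n++] = '\\';
		out[n++] = *in;
	}
	out[n] = '\0';
	return n;
}
```

A caller would pass a user-entered value such as a pool URL or password through this before placing it in a KEY=value reply. The JSON variant would escape only '"' and '\' in the same way.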

+ 79 - 24
adl.c

@@ -12,9 +12,12 @@
 #if defined(HAVE_ADL) && (defined(__linux) || defined (WIN32))
 
 #include <stdio.h>
-#include <curses.h>
 #include <string.h>
 
+#ifdef HAVE_CURSES
+#include <curses.h>
+#endif
+
 #include "miner.h"
 #include "ADL_SDK/adl_sdk.h"
 #include "compat.h"
@@ -121,11 +124,9 @@ static bool fanspeed_twin(struct gpu_adl *ga, struct gpu_adl *other_ga)
 	return true;
 }
 
-void init_adl(int nDevs)
+static bool prepare_adl(void)
 {
-	int result, i, j, devices = 0, last_adapter = -1, gpu = 0, dummy = 0;
-	struct gpu_adapters adapters[MAX_GPUDEVICES], vadapters[MAX_GPUDEVICES];
-	bool devs_match = true;
+	int result;
 
 #if defined (LINUX)
 	hDLL = dlopen( "libatiadlxx.so", RTLD_LAZY|RTLD_GLOBAL);
@@ -138,14 +139,8 @@ void init_adl(int nDevs)
 #endif
 	if (hDLL == NULL) {
 		applog(LOG_INFO, "Unable to load ati adl library");
-		return;
-	}
-
-	if (unlikely(pthread_mutex_init(&adl_lock, NULL))) {
-		applog(LOG_ERR, "Failed to init adl_lock in init_adl");
-		return;
+		return false;
 	}
-
 	ADL_Main_Control_Create = (ADL_MAIN_CONTROL_CREATE) GetProcAddress(hDLL,"ADL_Main_Control_Create");
 	ADL_Main_Control_Destroy = (ADL_MAIN_CONTROL_DESTROY) GetProcAddress(hDLL,"ADL_Main_Control_Destroy");
 	ADL_Adapter_NumberOfAdapters_Get = (ADL_ADAPTER_NUMBEROFADAPTERS_GET) GetProcAddress(hDLL,"ADL_Adapter_NumberOfAdapters_Get");
@@ -174,7 +169,7 @@ void init_adl(int nDevs)
 		!ADL_Main_Control_Refresh || !ADL_Overdrive5_PowerControl_Get ||
 		!ADL_Overdrive5_PowerControl_Set || !ADL_Overdrive5_FanSpeedToDefault_Set) {
 			applog(LOG_WARNING, "ATI ADL's API is missing");
-		return;
+		return false;
 	}
 
 	// Initialise ADL. The second parameter is 1, which means:
@@ -182,15 +177,32 @@ void init_adl(int nDevs)
 	result = ADL_Main_Control_Create (ADL_Main_Memory_Alloc, 1);
 	if (result != ADL_OK) {
 		applog(LOG_INFO, "ADL Initialisation Error! Error %d!", result);
-		return ;
+		return false;
 	}
 
 	result = ADL_Main_Control_Refresh();
 	if (result != ADL_OK) {
 		applog(LOG_INFO, "ADL Refresh Error! Error %d!", result);
-		return ;
+		return false;
 	}
 
+	return true;
+}
+
+void init_adl(int nDevs)
+{
+	int result, i, j, devices = 0, last_adapter = -1, gpu = 0, dummy = 0;
+	struct gpu_adapters adapters[MAX_GPUDEVICES], vadapters[MAX_GPUDEVICES];
+	bool devs_match = true;
+
+	if (unlikely(pthread_mutex_init(&adl_lock, NULL))) {
+		applog(LOG_ERR, "Failed to init adl_lock in init_adl");
+		return;
+	}
+
+	if (!prepare_adl())
+		return;
+
 	// Obtain the number of adapters for the system
 	result = ADL_Adapter_NumberOfAdapters_Get (&iNumberAdapters);
 	if (result != ADL_OK) {
@@ -463,7 +475,7 @@ void init_adl(int nDevs)
 		if (opt_autofan) {
 			ga->autofan = true;
 			/* Set a safe starting default if we're automanaging fan speeds */
-			set_fanspeed(gpu, gpus[gpu].gpu_fan);
+			set_fanspeed(gpu, 50);
 		}
 		if (opt_autoengine) {
 			ga->autoengine = true;
@@ -670,6 +682,16 @@ int gpu_fanpercent(int gpu)
 	lock_adl();
 	ret = __gpu_fanpercent(ga);
 	unlock_adl();
+	if (unlikely(ga->has_fanspeed && ret == -1)) {
+		applog(LOG_WARNING, "GPU %d stopped reporting fanspeed due to driver corruption", gpu);
+		if (opt_restart) {
+			applog(LOG_WARNING, "Restart enabled, will restart cgminer");
+			applog(LOG_WARNING, "You can disable this with the --no-restart option");
+			app_restart();
+		}
+		applog(LOG_WARNING, "Disabling fanspeed monitoring on this device");
+		ga->has_fanspeed = false;
+	}
 	return ret;
 }
 
@@ -853,6 +875,7 @@ static void get_vddcrange(int gpu, float *imin, float *imax)
 	*imax = (float)ga->lpOdParameters.sVddc.iMax / 1000;
 }
 
+#ifdef HAVE_CURSES
 static float curses_float(const char *query)
 {
 	float ret;
@@ -863,6 +886,7 @@ static float curses_float(const char *query)
 	free(cvar);
 	return ret;
 }
+#endif
 
 int set_vddc(int gpu, float fVddc)
 {
@@ -995,6 +1019,10 @@ static bool fan_autotune(int gpu, int temp, int fanpercent, int lasttemp)
 	if (temp > ga->overtemp && fanpercent < iMax) {
 		applog(LOG_WARNING, "Overheat detected on GPU %d, increasing fan to 100%", gpu);
 		newpercent = iMax;
+
+		cgpu->device_last_not_well = time(NULL);
+		cgpu->device_not_well_reason = REASON_DEV_OVER_HEAT;
+		cgpu->dev_over_heat_count++;
 	} else if (temp > ga->targettemp && fanpercent < top && temp >= lasttemp) {
 		applog(LOG_DEBUG, "Temperature over target, increasing fanspeed");
 		if (temp > ga->targettemp + opt_hysteresis)
@@ -1079,9 +1107,17 @@ void gpu_autotune(int gpu, enum dev_enable *denable)
 			applog(LOG_WARNING, "Hit thermal cutoff limit on GPU %d, disabling!", gpu);
 			*denable = DEV_RECOVER;
 			newengine = ga->minspeed;
+
+			cgpu->device_last_not_well = time(NULL);
+			cgpu->device_not_well_reason = REASON_DEV_THERMAL_CUTOFF;
+			cgpu->dev_thermal_cutoff_count++;
 		} else if (temp > ga->overtemp && engine > ga->minspeed) {
 			applog(LOG_WARNING, "Overheat detected, decreasing GPU %d clock speed", gpu);
 			newengine = ga->minspeed;
+
+			cgpu->device_last_not_well = time(NULL);
+			cgpu->device_not_well_reason = REASON_DEV_OVER_HEAT;
+			cgpu->dev_over_heat_count++;
 		} else if (temp > ga->targettemp + opt_hysteresis && engine > ga->minspeed && fan_optimal) {
 			applog(LOG_DEBUG, "Temperature %d degrees over target, decreasing clock speed", opt_hysteresis);
 			newengine = engine - ga->lpOdParameters.sEngineClock.iStep;
@@ -1141,6 +1177,7 @@ void set_defaultengine(int gpu)
 	unlock_adl();
 }
 
+#ifdef HAVE_CURSES
 void change_autosettings(int gpu)
 {
 	struct gpu_adl *ga = &gpus[gpu].adl;
@@ -1297,6 +1334,18 @@ updated:
 	sleep(1);
 	goto updated;
 }
+#endif
+
+static void free_adl(void)
+{
+	ADL_Main_Memory_Free ((void **)&lpInfo);
+	ADL_Main_Control_Destroy ();
+#if defined (LINUX)
+	dlclose(hDLL);
+#else
+	FreeLibrary(hDLL);
+#endif
+}
 
 void clear_adl(int nDevs)
 {
@@ -1318,15 +1367,21 @@ void clear_adl(int nDevs)
 		ADL_Overdrive5_FanSpeed_Set(ga->iAdapterIndex, 0, &ga->DefFanSpeedValue);
 		ADL_Overdrive5_FanSpeedToDefault_Set(ga->iAdapterIndex, 0);
 	}
-
-	ADL_Main_Memory_Free ( (void **)&lpInfo );
-	ADL_Main_Control_Destroy ();
+	adl_active = false;
 	unlock_adl();
+	free_adl();
+}
 
-#if defined (LINUX)
-	dlclose(hDLL);
-#else
-	FreeLibrary(hDLL);
-#endif
+void reinit_adl(void)
+{
+	bool ret;
+	lock_adl();
+	free_adl();
+	ret = prepare_adl();
+	if (!ret) {
+		adl_active = false;
+		applog(LOG_WARNING, "Attempt to re-initialise ADL has failed, disabling");
+	}
+	unlock_adl();
 }
 #endif /* HAVE_ADL */
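
Note on the adl.c hunks above: the same three-line bookkeeping (timestamp, reason, per-reason counter) is added at each overheat and thermal-cutoff site. A minimal sketch of how that could be folded into one helper follows. The field names and the two REASON_ constants are taken from the diff, while the struct dev_history wrapper, the REASON_NONE value and the record_not_well function are hypothetical names introduced only for this illustration; the real fields live in struct cgpu_info in miner.h.

```c
#include <assert.h>
#include <time.h>

/* Stand-in for the reason codes declared in miner.h; REASON_NONE is
 * an assumption added for the sketch. */
enum not_well_reason {
	REASON_NONE,
	REASON_DEV_OVER_HEAT,
	REASON_DEV_THERMAL_CUTOFF
};

/* Stand-in for the history fields the diff sets on struct cgpu_info. */
struct dev_history {
	time_t device_last_not_well;
	enum not_well_reason device_not_well_reason;
	int dev_over_heat_count;
	int dev_thermal_cutoff_count;
};

/* Record one "not well" event: when it happened, why, and bump the
 * counter that belongs to that reason. */
void record_not_well(struct dev_history *h, enum not_well_reason reason,
		     time_t now)
{
	assert(h != NULL);

	h->device_last_not_well = now;
	h->device_not_well_reason = reason;

	switch (reason) {
	case REASON_DEV_OVER_HEAT:
		h->dev_over_heat_count++;
		break;
	case REASON_DEV_THERMAL_CUTOFF:
		h->dev_thermal_cutoff_count++;
		break;
	default:
		break;
	}
}
```

With such a helper, each site in fan_autotune and gpu_autotune would reduce to a single call taking time(NULL) as the last argument.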

+ 2 - 0
adl.h

@@ -19,10 +19,12 @@ bool gpu_stats(int gpu, float *temp, int *engineclock, int *memclock, float *vdd
 void change_gpusettings(int gpu);
 void gpu_autotune(int gpu, enum dev_enable *denable);
 void clear_adl(int nDevs);
+void reinit_adl(void);
 #else /* HAVE_ADL */
 #define adl_active (0)
 static inline void init_adl(int nDevs) {}
 static inline void change_gpusettings(int gpu) { }
 static inline void clear_adl(int nDevs) {}
+static inline void reinit_adl(void) {}
 #endif
 #endif

+ 671 - 37
api.c

File diff suppressed because it is too large

+ 216 - 44
cgminer.c

@@ -11,7 +11,9 @@
 
 #include "config.h"
 
+#ifdef HAVE_CURSES
 #include <curses.h>
+#endif
 
 #include <stdio.h>
 #include <stdlib.h>
@@ -100,9 +102,9 @@ static const bool opt_time = true;
 
 #ifdef HAVE_OPENCL
 int opt_dynamic_interval = 7;
+#endif
 bool opt_restart = true;
 static bool opt_nogpu;
-#endif
 
 struct list_head scan_devices;
 int nDevs;
@@ -116,7 +118,13 @@ int gpu_threads;
 int opt_n_threads = -1;
 int mining_threads;
 int num_processors;
-bool use_curses = true;
+bool use_curses =
+#ifdef HAVE_CURSES
+	true
+#else
+	false
+#endif
+;
 static bool opt_submit_stale;
 static int opt_shares;
 static bool opt_fail_only;
@@ -141,7 +149,9 @@ int longpoll_thr_id;
 static int stage_thr_id;
 static int watchpool_thr_id;
 static int watchdog_thr_id;
+#ifdef HAVE_CURSES
 static int input_thr_id;
+#endif
 int gpur_thr_id;
 static int api_thr_id;
 static int total_threads;
@@ -151,7 +161,9 @@ struct work_restart *work_restart = NULL;
 static pthread_mutex_t hash_lock;
 static pthread_mutex_t qd_lock;
 static pthread_mutex_t *stgd_lock;
+#ifdef HAVE_CURSES
 static pthread_mutex_t curses_lock;
+#endif
 static pthread_rwlock_t blk_lock;
 pthread_rwlock_t netacc_lock;
 
@@ -179,6 +191,9 @@ enum pool_strategy pool_strategy = POOL_FAILOVER;
 int opt_rotate_period;
 static int total_urls, total_users, total_passes, total_userpasses;
 
+#ifndef HAVE_CURSES
+const
+#endif
 static bool curses_active = false;
 
 static char current_block[37];
@@ -206,6 +221,7 @@ static int include_count = 0;
 
 #if defined(unix)
 	static char *opt_stderr_cmd = NULL;
+	static int forkpid = 0;
 #endif // defined(unix)
 
 bool ping = true;
@@ -663,11 +679,14 @@ static struct opt_table opt_config_table[] = {
 	OPT_WITH_ARG("--device|-d",
 		     set_devices, NULL, NULL,
 	             "Select device to use, (Use repeat -d for multiple devices, default: all)"),
-#ifdef HAVE_OPENCL
 	OPT_WITHOUT_ARG("--disable-gpu|-G",
 			opt_set_bool, &opt_nogpu,
-			"Disable GPU mining even if suitable devices exist"),
+#ifdef HAVE_OPENCL
+			"Disable GPU mining even if suitable devices exist"
+#else
+			opt_hidden
 #endif
+	),
 #if defined(WANT_CPUMINE) && (defined(HAVE_OPENCL) || defined(USE_BITFORCE) || defined(USE_ICARUS))
 	OPT_WITHOUT_ARG("--enable-cpu|-C",
 			opt_set_bool, &opt_usecpu,
@@ -736,19 +755,25 @@ static struct opt_table opt_config_table[] = {
 	OPT_WITHOUT_ARG("--net-delay",
 			opt_set_bool, &opt_delaynet,
 			"Impose small delays in networking to not overload slow routers"),
-#ifdef HAVE_ADL
 	OPT_WITHOUT_ARG("--no-adl",
 			opt_set_bool, &opt_noadl,
-			"Disable the ATI display library used for monitoring and setting GPU parameters"),
+#ifdef HAVE_ADL
+			"Disable the ATI display library used for monitoring and setting GPU parameters"
+#else
+			opt_hidden
 #endif
+	),
 	OPT_WITHOUT_ARG("--no-longpoll",
 			opt_set_invbool, &want_longpoll,
 			"Disable X-Long-Polling support"),
-#ifdef HAVE_OPENCL
 	OPT_WITHOUT_ARG("--no-restart",
 			opt_set_invbool, &opt_restart,
-			"Do not attempt to restart GPUs that hang"),
+#ifdef HAVE_OPENCL
+			"Do not attempt to restart GPUs that hang"
+#else
+			opt_hidden
 #endif
+	),
 	OPT_WITH_ARG("--pass|-p",
 		     set_pass, NULL, NULL,
 		     "Password for bitcoin JSON-RPC server"),
@@ -828,7 +853,12 @@ static struct opt_table opt_config_table[] = {
 #endif
 	OPT_WITHOUT_ARG("--text-only|-T",
 			opt_set_invbool, &use_curses,
-			"Disable ncurses formatted screen output"),
+#ifdef HAVE_CURSES
+			"Disable ncurses formatted screen output"
+#else
+			opt_hidden
+#endif
+	),
 	OPT_WITH_ARG("--url|-o",
 		     set_url, NULL, NULL,
 		     "URL for bitcoin JSON-RPC server"),
@@ -1111,10 +1141,10 @@ void decay_time(double *f, double fadd)
 			ratio = 1 / ratio;
 	}
 
-	if (ratio > 0.95)
-		*f = (fadd * 0.1 + *f) / 1.1;
+	if (ratio > 0.63)
+		*f = (fadd * 0.58 + *f) / 1.58;
 	else
-		*f = (fadd + *f * 0.1) / 1.1;
+		*f = (fadd + *f * 0.58) / 1.58;
 }
 
 static int requests_staged(void)
@@ -1127,13 +1157,16 @@ static int requests_staged(void)
 	return ret;
 }
 
+#ifdef HAVE_CURSES
 WINDOW *mainwin, *statuswin, *logwin;
+#endif
 double total_secs = 0.1;
 static char statusline[256];
 static int devcursor, logstart, logcursor;
 struct cgpu_info gpus[MAX_GPUDEVICES]; /* Maximum number apparently possible */
 struct cgpu_info *cpus;
 
+#ifdef HAVE_CURSES
 static inline void unlock_curses(void)
 {
 	mutex_unlock(&curses_lock);
@@ -1154,6 +1187,7 @@ static bool curses_active_locked(void)
 		unlock_curses();
 	return ret;
 }
+#endif
 
 void tailsprintf(char *f, const char *fmt, ...)
 {
@@ -1192,6 +1226,7 @@ static void text_print_status(int thr_id)
 	}
 }
 
+#ifdef HAVE_CURSES
 /* Must be called with curses mutex lock held and curses_active */
 static void curses_print_status(void)
 {
@@ -1237,7 +1272,9 @@ static void curses_print_devstatus(int thr_id)
 	struct cgpu_info *cgpu = thr_info[thr_id].cgpu;
 	char logline[255];
 
-		cgpu->utility = cgpu->accepted / ( total_secs ? total_secs : 1 ) * 60;
+	cgpu->utility = cgpu->accepted / ( total_secs ? total_secs : 1 ) * 60;
+	if (total_devices > 14)
+		return;
 
 	mvwprintw(statuswin, devcursor + cgpu->cgminer_id, 0, " %s %d: ", cgpu->api->name, cgpu->device_id);
 	if (cgpu->api->get_statline_before) {
@@ -1245,10 +1282,11 @@ static void curses_print_devstatus(int thr_id)
 		cgpu->api->get_statline_before(logline, cgpu);
 		wprintw(statuswin, "%s", logline);
 	}
-		if (cgpu->status == LIFE_DEAD)
-			wprintw(statuswin, "DEAD ");
-		else if (cgpu->status == LIFE_SICK)
-			wprintw(statuswin, "SICK ");
+
+	if (cgpu->status == LIFE_DEAD)
+		wprintw(statuswin, "DEAD ");
+	else if (cgpu->status == LIFE_SICK)
+		wprintw(statuswin, "SICK ");
 	else if (cgpu->deven == DEV_DISABLED)
 		wprintw(statuswin, "OFF  ");
 	else if (cgpu->deven == DEV_RECOVER)
@@ -1272,8 +1310,9 @@ static void curses_print_devstatus(int thr_id)
 		wprintw(statuswin, "%s", logline);
 	}
 
-		wclrtoeol(statuswin);
+	wclrtoeol(statuswin);
 }
+#endif
 
 static void print_status(int thr_id)
 {
@@ -1281,6 +1320,7 @@ static void print_status(int thr_id)
 		text_print_status(thr_id);
 }
 
+#ifdef HAVE_CURSES
 /* Check for window resize. Called with curses mutex locked */
 static inline bool change_logwinsize(void)
 {
@@ -1336,7 +1376,9 @@ void wlogprint(const char *f, ...)
 		unlock_curses();
 	}
 }
+#endif
 
+#ifdef HAVE_CURSES
 void log_curses(int prio, const char *f, va_list ap)
 {
 	bool high_prio;
@@ -1366,6 +1408,7 @@ void clear_logwin(void)
 		unlock_curses();
 	}
 }
+#endif
 
 /* regenerate the full work->hash value and also return true if it's a block */
 bool regeneratehash(const struct work *work)
@@ -1700,6 +1743,7 @@ static void workio_cmd_free(struct workio_cmd *wc)
 	free(wc);
 }
 
+#ifdef HAVE_CURSES
 static void disable_curses(void)
 {
 	if (curses_active_locked()) {
@@ -1728,11 +1772,11 @@ static void disable_curses(void)
 		unlock_curses();
 	}
 }
+#endif
 
 static void print_summary(void);
 
-/* This should be the common exit path */
-void kill_work(void)
+static void __kill_work(void)
 {
 	struct thr_info *thr;
 	int i;
@@ -1779,11 +1823,37 @@ void kill_work(void)
 	applog(LOG_DEBUG, "Killing off API thread");
 	thr = &thr_info[api_thr_id];
 	thr_info_cancel(thr);
+}
+
+/* This should be the common exit path */
+void kill_work(void)
+{
+	__kill_work();
 
 	quit(0, "Shutdown signal received.");
 }
 
-void quit(int status, const char *format, ...);
+static char **initial_args;
+
+static void clean_up(void);
+
+void app_restart(void)
+{
+	applog(LOG_WARNING, "Attempting to restart %s", packagename);
+
+	__kill_work();
+	clean_up();
+
+#if defined(unix)
+	if (forkpid > 0) {
+		kill(forkpid, SIGTERM);
+		forkpid = 0;
+	}
+#endif
+
+	execv(initial_args[0], initial_args);
+	applog(LOG_WARNING, "Failed to restart application");
+}
 
 static void sighandler(int __maybe_unused sig)
 {
@@ -1882,16 +1952,16 @@ static void *submit_work_thread(void *userdata)
 	pthread_detach(pthread_self());
 
 	if (stale_work(work, true)) {
-		total_stale++;
-		pool->stale_shares++;
-		if (!opt_submit_stale && !pool->submit_old) {
-			applog(LOG_NOTICE, "Stale share detected, discarding");
-			goto out;
-		}
 		if (opt_submit_stale)
 			applog(LOG_NOTICE, "Stale share detected, submitting as user requested");
 		else if (pool->submit_old)
 			applog(LOG_NOTICE, "Stale share detected, submitting as pool requested");
+		else {
+			applog(LOG_NOTICE, "Stale share detected, discarding");
+			total_stale++;
+			pool->stale_shares++;
+			goto out;
+		}
 	}
 
 	/* submit solution to bitcoin via JSON-RPC */
@@ -2112,6 +2182,7 @@ static void set_curblock(char *hexstr, unsigned char *hash)
 	current_hash = bin2hex(hash_swap, 16);
 	if (unlikely(!current_hash))
 		quit (1, "set_curblock OOM");
+	applog(LOG_INFO, "New block: %s...", current_hash);
 	if (old_hash)
 		free(old_hash);
 }
@@ -2259,6 +2330,7 @@ static bool stage_work(struct work *work)
 	return true;
 }
 
+#ifdef HAVE_CURSES
 int curses_int(const char *query)
 {
 	int ret;
@@ -2269,8 +2341,11 @@ int curses_int(const char *query)
 	free(cvar);
 	return ret;
 }
+#endif
 
+#ifdef HAVE_CURSES
 static bool input_pool(bool live);
+#endif
 
 int active_pools(void)
 {
@@ -2284,6 +2359,7 @@ int active_pools(void)
 	return ret;
 }
 
+#ifdef HAVE_CURSES
 static void display_pool_summary(struct pool *pool)
 {
 	double efficiency = 0.0;
@@ -2307,10 +2383,11 @@ static void display_pool_summary(struct pool *pool)
 		unlock_curses();
 	}
 }
+#endif
 
 /* We can't remove the memory used for this struct pool because there may
  * still be work referencing it. We just remove it from the pools list */
-static void remove_pool(struct pool *pool)
+void remove_pool(struct pool *pool)
 {
 	int i, last_pool = total_pools - 1;
 	struct pool *other;
@@ -2481,6 +2558,7 @@ void write_config(FILE *fcfg)
 	fputs("\n}", fcfg);
 }
 
+#ifdef HAVE_CURSES
 static void display_pools(void)
 {
 	struct pool *pool;
@@ -2685,10 +2763,12 @@ retry:
 	immedok(logwin, false);
 	opt_loginput = false;
 }
+#endif
 
 static void start_longpoll(void);
 static void stop_longpoll(void);
 
+#ifdef HAVE_CURSES
 static void set_options(void)
 {
 	int selected;
@@ -2699,7 +2779,8 @@ static void set_options(void)
 	clear_logwin();
 retry:
 	wlogprint("\n[L]ongpoll: %s\n", want_longpoll ? "On" : "Off");
-	wlogprint("[Q]ueue: %d\n[S]cantime: %d\n[E]xpiry: %d\n[R]etries: %d\n[P]ause: %d\n[W]rite config file\n",
+	wlogprint("[Q]ueue: %d\n[S]cantime: %d\n[E]xpiry: %d\n[R]etries: %d\n"
+		  "[P]ause: %d\n[W]rite config file\n[C]gminer restart\n",
 		opt_queue, opt_scantime, opt_expiry, opt_retries, opt_fail_pause);
 	wlogprint("Select an option or any other key to return\n");
 	input = getch();
@@ -2792,6 +2873,13 @@ retry:
 		fclose(fcfg);
 		goto retry;
 
+	} else if (!strncasecmp(&input, "c", 1)) {
+		wlogprint("Are you sure?\n");
+		input = getch();
+		if (!strncasecmp(&input, "y", 1))
+			app_restart();
+		else
+			clear_logwin();
 	} else
 		clear_logwin();
 
@@ -2829,6 +2917,7 @@ static void *input_thread(void __maybe_unused *userdata)
 
 	return NULL;
 }
+#endif
 
 /* This thread should not be shut down unless a problem occurs */
 static void *workio_thread(void *userdata)
@@ -2888,6 +2977,7 @@ void thread_reportin(struct thr_info *thr)
 	gettimeofday(&thr->last, NULL);
 	thr->cgpu->status = LIFE_WELL;
 	thr->getwork = false;
+	thr->cgpu->device_last_well = time(NULL);
 }
 
 static inline void thread_reportout(struct thr_info *thr)
@@ -2908,8 +2998,10 @@ static void hashmeter(int thr_id, struct timeval *diff,
 	bool showlog = false;
 
 	/* Update the last time this thread reported in */
-	if (thr_id >= 0)
+	if (thr_id >= 0) {
 		gettimeofday(&thr_info[thr_id].last, NULL);
+		thr_info[thr_id].cgpu->device_last_well = time(NULL);
+	}
 
 	/* Don't bother calculating anything if we're not displaying it */
 	if (opt_realquiet || !opt_log_interval)
@@ -2935,8 +3027,10 @@ static void hashmeter(int thr_id, struct timeval *diff,
 			if (th->cgpu == cgpu)
 				thread_rolling += th->rolling;
 		}
+		mutex_lock(&hash_lock);
 		decay_time(&cgpu->rolling, thread_rolling);
 		cgpu->total_mhashes += local_mhashes;
+		mutex_unlock(&hash_lock);
 
 		// If needed, output detailed, per-device stats
 		if (want_per_device_stats) {
@@ -3423,8 +3517,13 @@ void *miner_thread(void *userdata)
 	bool requested = false;
 	pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
 
-	if (api->thread_init && !api->thread_init(mythr))
+	if (api->thread_init && !api->thread_init(mythr)) {
+		cgpu->device_last_not_well = time(NULL);
+		cgpu->device_not_well_reason = REASON_THREAD_FAIL_INIT;
+		cgpu->thread_fail_init_count++;
+
 		goto out;
+	}
 
 	thread_reportout(mythr);
 	applog(LOG_DEBUG, "Popping ping in miner thread");
@@ -3473,8 +3572,14 @@ void *miner_thread(void *userdata)
 				break;
 			}
 
-			if (unlikely(!hashes))
+			if (unlikely(!hashes)) {
+				cgpu->device_last_not_well = time(NULL);
+				cgpu->device_not_well_reason = REASON_THREAD_ZERO_HASH;
+				cgpu->thread_zero_hash_count++;
+
 				goto out;
+			}
+
 			hashes_done += hashes;
 			if (hashes > cgpu->max_hashes)
 				cgpu->max_hashes = hashes;
@@ -3494,6 +3599,11 @@ void *miner_thread(void *userdata)
 					thread_reportout(mythr);
 					if (unlikely(!queue_request(mythr, false))) {
 						applog(LOG_ERR, "Failed to queue_request in miner_thread %d", thr_id);
+
+						cgpu->device_last_not_well = time(NULL);
+						cgpu->device_not_well_reason = REASON_THREAD_FAIL_QUEUE;
+						cgpu->thread_fail_queue_count++;
+
 						goto out;
 					}
 					thread_reportin(mythr);
@@ -3526,7 +3636,7 @@ void *miner_thread(void *userdata)
 				tv_lastupdate = tv_end;
 			}
 
-			if (unlikely(mythr->pause || cgpu->deven == DEV_DISABLED)) {
+			if (unlikely(mythr->pause || cgpu->deven != DEV_ENABLED)) {
 				applog(LOG_WARNING, "Thread %d being disabled", thr_id);
 				mythr->rolling = mythr->cgpu->rolling = 0;
 				applog(LOG_DEBUG, "Popping wakeup ping in miner thread");
@@ -3709,6 +3819,7 @@ out:
 	return NULL;
 }
 
+__maybe_unused
 static void stop_longpoll(void)
 {
 	struct thr_info *thr = &thr_info[longpoll_thr_id];
@@ -3795,6 +3906,7 @@ static void *watchdog_thread(void __maybe_unused *userdata)
 
 		hashmeter(-1, &zero_tv, 0);
 
+#ifdef HAVE_CURSES
 		if (curses_active_locked()) {
 			change_logwinsize();
 			curses_print_status();
@@ -3806,6 +3918,7 @@ static void *watchdog_thread(void __maybe_unused *userdata)
 			wrefresh(logwin);
 			unlock_curses();
 		}
+#endif
 
 		gettimeofday(&now, NULL);
 
@@ -3879,11 +3992,16 @@ static void *watchdog_thread(void __maybe_unused *userdata)
 			if (gpus[gpu].status != LIFE_WELL && now.tv_sec - thr->last.tv_sec < 60) {
 				applog(LOG_ERR, "Device %d recovered, GPU %d declared WELL!", i, gpu);
 				gpus[gpu].status = LIFE_WELL;
+				gpus[gpu].device_last_well = time(NULL);
 			} else if (now.tv_sec - thr->last.tv_sec > 60 && gpus[gpu].status == LIFE_WELL) {
 				thr->rolling = thr->cgpu->rolling = 0;
 				gpus[gpu].status = LIFE_SICK;
 				applog(LOG_ERR, "Device %d idle for more than 60 seconds, GPU %d declared SICK!", i, gpu);
 				gettimeofday(&thr->sick, NULL);
+
+				gpus[gpu].device_last_not_well = time(NULL);
+				gpus[gpu].device_not_well_reason = REASON_DEV_SICK_IDLE_60;
+				gpus[gpu].dev_sick_idle_60_count++;
 #ifdef HAVE_ADL
 				if (adl_active && gpus[gpu].has_adl && gpu_activity(gpu) > 50) {
 					applog(LOG_ERR, "GPU still showing activity suggesting a hard hang.");
@@ -3898,6 +4016,10 @@ static void *watchdog_thread(void __maybe_unused *userdata)
 				gpus[gpu].status = LIFE_DEAD;
 				applog(LOG_ERR, "Device %d not responding for more than 10 minutes, GPU %d declared DEAD!", i, gpu);
 				gettimeofday(&thr->sick, NULL);
+
+				gpus[gpu].device_last_not_well = time(NULL);
+				gpus[gpu].device_not_well_reason = REASON_DEV_DEAD_IDLE_600;
+				gpus[gpu].dev_dead_idle_600_count++;
 			} else if (now.tv_sec - thr->sick.tv_sec > 60 &&
 				   (gpus[i].status == LIFE_SICK || gpus[i].status == LIFE_DEAD)) {
 				/* Attempt to restart a GPU that's sick or dead once every minute */
@@ -3921,8 +4043,8 @@ static void log_print_status(struct cgpu_info *cgpu)
 {
 	char logline[255];
 
-		get_statline(logline, cgpu);
-		applog(LOG_WARNING, "%s", logline);
+	get_statline(logline, cgpu);
+	applog(LOG_WARNING, "%s", logline);
 }
 
 static void print_summary(void)
@@ -4010,7 +4132,9 @@ static void clean_up(void)
 #endif
 
 	gettimeofday(&total_tv_end, NULL);
+#ifdef HAVE_CURSES
 	disable_curses();
+#endif
 	if (!opt_realquiet && successful_connect)
 		print_summary();
 
@@ -4034,9 +4158,17 @@ void quit(int status, const char *format, ...)
 	fprintf(stderr, "\n");
 	fflush(stderr);
 
+#if defined(unix)
+	if (forkpid > 0) {
+		kill(forkpid, SIGTERM);
+		forkpid = 0;
+	}
+#endif
+
 	exit(status);
 }
 
+#ifdef HAVE_CURSES
 char *curses_input(const char *query)
 {
 	char *input;
@@ -4054,6 +4186,7 @@ char *curses_input(const char *query)
 	noecho();
 	return input;
 }
+#endif
 
 int add_pool_details(bool live, char *url, char *user, char *pass)
 {
@@ -4089,6 +4222,7 @@ int add_pool_details(bool live, char *url, char *user, char *pass)
 	return ADD_POOL_OK;
 }
 
+#ifdef HAVE_CURSES
 static bool input_pool(bool live)
 {
 	char *url = NULL, *user = NULL, *pass = NULL;
@@ -4140,6 +4274,7 @@ out:
 	}
 	return ret;
 }
+#endif
 
 #if defined(unix)
 	static void fork_monitor()
@@ -4174,14 +4309,14 @@ out:
 		}
 
 		// Fork a child process
-		r = fork();
-		if (r<0) {
+		forkpid = fork();
+		if (forkpid<0) {
 			perror("fork - failed to fork child process for --monitor");
 			exit(1);
 		}
 
 		// Child: launch monitor command
-		if (0==r) {
+		if (0==forkpid) {
 			// Make stdin read end of pipe
 			r = dup2(pfd[0], 0);
 			if (r<0) {
@@ -4209,6 +4344,7 @@ out:
 	}
 #endif // defined(unix)
 
+#ifdef HAVE_CURSES
 void enable_curses(void) {
 	int x,y;
 
@@ -4231,6 +4367,7 @@ void enable_curses(void) {
 	curses_active = true;
 	unlock_curses();
 }
+#endif
 
 /* TODO: fix need a dummy CPU device_api even if no support for CPU mining */
 #ifndef WANT_CPUMINE
@@ -4288,7 +4425,7 @@ bool add_cgpu(struct cgpu_info*cgpu)
 	return true;
 }
 
-int main (int argc, char *argv[])
+int main(int argc, char *argv[])
 {
 	struct block *block, *tmpblock;
 	struct work *work, *tmpwork;
@@ -4303,9 +4440,16 @@ int main (int argc, char *argv[])
 	if (unlikely(curl_global_init(CURL_GLOBAL_ALL)))
 		quit(1, "Failed to curl_global_init");
 
+	initial_args = malloc(sizeof(char *) * (argc + 1));
+	for  (i = 0; i < argc; i++)
+		initial_args[i] = strdup(argv[i]);
+	initial_args[argc] = NULL;
+
 	mutex_init(&hash_lock);
 	mutex_init(&qd_lock);
+#ifdef HAVE_CURSES
 	mutex_init(&curses_lock);
+#endif
 	mutex_init(&control_lock);
 	rwlock_init(&blk_lock);
 	rwlock_init(&netacc_lock);
@@ -4387,8 +4531,13 @@ int main (int argc, char *argv[])
 		successful_connect = true;
 	}
 
+#ifdef HAVE_CURSES
+	if (opt_realquiet || devices_enabled == -1)
+		use_curses = false;
+
 	if (use_curses)
 		enable_curses();
+#endif
 
 	applog(LOG_WARNING, "Started %s", packagename);
 
@@ -4493,17 +4642,29 @@ int main (int argc, char *argv[])
 
 	load_temp_cutoffs();
 
-	logstart += total_devices;
+	if (total_devices <= 14) {
+		logstart += total_devices;
+	} else {
+		applog(LOG_NOTICE, "Too many devices exist for per-device status lines");
+		for (i = 0; i < total_devices; ++i) {
+			struct cgpu_info *cgpu = devices[i];
+
+			applog(LOG_NOTICE, "%s%d: %s", cgpu->api->name, cgpu->device_id,
+				cgpu->deven == DEV_ENABLED? "Enabled" : "Disabled");
+		}
+		applog(LOG_NOTICE, "%d devices found, disabling per-device status lines", total_devices);
+	}
 	logcursor = logstart + 1;
 
+#ifdef HAVE_CURSES
 	check_winsizes();
-
-	if (opt_realquiet)
-		use_curses = false;
+#endif
 
 	if (!total_pools) {
 		applog(LOG_WARNING, "Need to specify at least one pool server.");
-		if (!use_curses || (use_curses && !input_pool(false)))
+#ifdef HAVE_CURSES
+		if (!use_curses || !input_pool(false))
+#endif
 			quit(1, "Pool setup failed");
 	}
 
@@ -4625,6 +4786,7 @@ int main (int argc, char *argv[])
 				applog(LOG_WARNING, "Pool: %d  URL: %s  User: %s  Password: %s",
 				       i, pool->rpc_url, pool->rpc_user, pool->rpc_pass);
 			}
+#ifdef HAVE_CURSES
 			if (use_curses) {
 				halfdelay(150);
 				applog(LOG_ERR, "Press any key to exit, or cgminer will try again in 15s.");
@@ -4632,6 +4794,7 @@ int main (int argc, char *argv[])
 					quit(0, "No servers could be used! Exiting.");
 				nocbreak();
 			} else
+#endif
 				quit(0, "No servers could be used! Exiting.");
 		}
 	} while (!pools_active);
@@ -4736,6 +4899,7 @@ begin_bench:
 		quit(1, "API thread create failed");
 	pthread_detach(thr->pth);
 
+#ifdef HAVE_CURSES
 	/* Create curses input thread for keyboard input. Create this last so
 	 * that we know all threads are created since this can call kill_work
 	 * to try and shut down all previous threads. */
@@ -4744,6 +4908,7 @@ begin_bench:
 	if (thr_info_create(thr, NULL, input_thread, thr))
 		quit(1, "input thread create failed");
 	pthread_detach(thr->pth);
+#endif
 
 	/* main loop - simply wait for workio thread to exit. This is not the
 	 * normal exit path and only occurs should the workio_thread die
@@ -4763,5 +4928,12 @@ begin_bench:
 		free(block);
 	}
 
+#if defined(unix)
+	if (forkpid > 0) {
+		kill(forkpid, SIGTERM);
+		forkpid = 0;
+	}
+#endif
+
 	return 0;
 }
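
The restart path in the cgminer.c hunks above relies on main() caching its arguments in initial_args so that app_restart() can hand them to execv() later. A minimal standalone sketch of that pattern follows, assuming a POSIX system; dup_args and restart_self are hypothetical names used for illustration and are not functions in cgminer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: deep-copy argv into a NULL-terminated array,
 * which is the layout execv() expects. Returns NULL if allocation fails. */
static char **dup_args(int argc, char *const argv[])
{
	char **copy;
	int i;

	assert(argc >= 0);
	copy = malloc(sizeof(char *) * (argc + 1));
	if (!copy)
		return NULL;

	for (i = 0; i < argc; i++) {
		copy[i] = strdup(argv[i]);
		if (!copy[i]) {
			/* Undo the partial copy before reporting failure */
			while (i-- > 0)
				free(copy[i]);
			free(copy);
			return NULL;
		}
	}
	copy[argc] = NULL;
	return copy;
}

/* Hypothetical helper: replace the current process image with a fresh
 * copy of the same program. execv() only returns on failure, so reaching
 * the final return means the restart did not happen. */
static int restart_self(char **args)
{
	if (!args || !args[0])
		return -1;
	execv(args[0], args);
	return -1;
}
```

Note that execv() does not search PATH, so a sketch like this only restarts successfully when args[0] is a path that resolves from the current working directory; execvp() is the variant that performs a PATH lookup.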

+ 34 - 13
configure.ac

@@ -2,7 +2,7 @@
 ##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
 m4_define([v_maj], [2])
 m4_define([v_min], [3])
-m4_define([v_mic], [1])
+m4_define([v_mic], [3])
 ##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
 m4_define([v_ver], [v_maj.v_min.v_mic])
 m4_define([lt_rev], m4_eval(v_maj + v_min))
@@ -207,12 +207,33 @@ if test "x$icarus" = xyes; then
 fi
 AM_CONDITIONAL([HAS_ICARUS], [test x$icarus = xyes])
 
-AC_SEARCH_LIBS(addstr, ncurses pdcurses, ,
-        AC_MSG_ERROR([Could not find curses library - please install libncurses-dev or pdcurses-dev]))
 
-AC_CHECK_LIB(ncurses, addstr, NCURSES_LIBS=-lncurses)
-AC_CHECK_LIB(pdcurses, addstr, PDCURSES_LIBS=-lpdcurses)
+curses="auto"
 
+AC_ARG_WITH([curses],
+	[AC_HELP_STRING([--without-curses],[Compile support for curses TUI (default enabled)])],
+	[curses=$withval]
+	)
+if test "x$curses" = "xno"; then
+	cursesmsg='User specified --without-curses. TUI support DISABLED'
+else
+	AC_SEARCH_LIBS(addstr, ncurses pdcurses, [
+		curses=yes
+		cursesmsg="FOUND: ${ac_cv_search_addstr:2}"
+		AC_DEFINE([HAVE_CURSES], [1], [Defined to 1 if curses TUI support is wanted])
+	], [
+		if test "x$curses" = "xyes"; then
+			AC_MSG_ERROR([Could not find curses library - please install libncurses-dev or pdcurses-dev (or configure --without-curses)])
+		else
+			AC_MSG_WARN([Could not find curses library - if you want a TUI, install libncurses-dev or pdcurses-dev])
+			curses=no
+			cursesmsg='NOT FOUND. TUI support DISABLED'
+		fi
+	])
+fi
+
+
+AM_CONDITIONAL([HAVE_CURSES], [test x$curses = xyes])
 AM_CONDITIONAL([WANT_JANSSON], [test x$request_jansson = xtrue])
 AM_CONDITIONAL([HAVE_WINDOWS], [test x$have_win32 = xtrue])
 AM_CONDITIONAL([HAVE_x86_64], [test x$have_x86_64 = xtrue])
@@ -261,12 +282,12 @@ fi
 AM_CONDITIONAL([HAS_YASM], [test x$has_yasm = xtrue])
 
 if test "x$bitforce" != xno; then
-	AC_ARG_WITH([libudev], [AC_HELP_STRING([--with-libudev], [Autodetect FPGAs using libudev])],
-		[libudev=$enableval],
+	AC_ARG_WITH([libudev], [AC_HELP_STRING([--without-libudev], [Autodetect FPGAs using libudev (default enabled)])],
+		[libudev=$withval],
 		[libudev=auto]
 		)
 	if test "x$libudev" != "xno"; then
-		AC_CHECK_LIB([udev], [udev_device_get_devnode], [
+		AC_CHECK_HEADER([libudev.h],[
 			libudev=yes
 			UDEV_LIBS=-ludev
 			AC_DEFINE([HAVE_LIBUDEV], [1], [Defined to 1 if libudev is wanted])
@@ -325,9 +346,9 @@ fi
 AC_DEFINE_UNQUOTED([CGMINER_PREFIX], ["$prefix/bin"], [Path to cgminer install])
 
 AC_DEFINE_UNQUOTED([PHATK_KERNNAME], ["phatk120223"], [Filename for phatk kernel])
-AC_DEFINE_UNQUOTED([POCLBM_KERNNAME], ["poclbm120222"], [Filename for poclbm kernel])
+AC_DEFINE_UNQUOTED([POCLBM_KERNNAME], ["poclbm120327"], [Filename for poclbm kernel])
 AC_DEFINE_UNQUOTED([DIAKGCN_KERNNAME], ["diakgcn120223"], [Filename for diakgcn kernel])
-AC_DEFINE_UNQUOTED([DIABLO_KERNNAME], ["diablo120222"], [Filename for diablo kernel])
+AC_DEFINE_UNQUOTED([DIABLO_KERNNAME], ["diablo120328"], [Filename for diablo kernel])
 
 
 AC_SUBST(OPENCL_LIBS)
@@ -365,6 +386,8 @@ echo
 echo "Configuration Options Summary:"
 echo
 
+echo "  curses.TUI...........: $cursesmsg"
+
 if test "x$opencl" != xno; then
 	if test $found_opencl = 1; then
 		echo "  OpenCL...............: FOUND. GPU mining support enabled"
@@ -408,12 +431,10 @@ if test "x$bitforce" != xno; then
 	echo "  libudev.detection....: $libudev"
 fi
 
-echo
 if test "x$cpumining" = xyes; then
+	echo
 	echo "  CPU Mining...........: Enabled"
 	echo "  ASM.(for CPU mining).: $has_yasm"
-else
-	echo "  CPU Mining...........: Disabled"
 fi
 
 echo

+ 10 - 1
diablo120222.cl → diablo120328.cl

@@ -44,8 +44,13 @@
 #define ZR26(n) ((Zrotr((n), 26) ^ Zrotr((n), 21) ^ Zrotr((n), 7)))
 #define ZR30(n) ((Zrotr((n), 30) ^ Zrotr((n), 19) ^ Zrotr((n), 10)))
 
-__kernel __attribute__((reqd_work_group_size(WORKSIZE, 1, 1))) void search(
+__kernel
+__attribute__((vec_type_hint(z)))
+__attribute__((reqd_work_group_size(WORKSIZE, 1, 1)))
+void search(
+#ifndef GOFFSET
     const z base,
+#endif
     const uint PreVal4_state0, const uint PreVal4_state0_k7,
     const uint PreVal4_T1,
     const uint W18, const uint W19,
@@ -62,7 +67,11 @@ __kernel __attribute__((reqd_work_group_size(WORKSIZE, 1, 1))) void search(
 
   z ZA[930];
 
+#ifdef GOFFSET
+	const z Znonce = (uint)(get_global_id(0));
+#else
 	const z Znonce = base + (uint)(get_global_id(0));
+#endif
 
     ZA[15] = Znonce + PreVal4_state0;
     

+ 11 - 61
diakgcn120223.cl

@@ -1,11 +1,9 @@
-// DiaKGCN 24-02-2012 - OpenCL kernel by Diapolo
+// DiaKGCN 16-03-2012 - OpenCL kernel by Diapolo
 //
 // Parts and / or ideas for this kernel are based upon the public-domain poclbm project, the phatk kernel by Phateus and the DiabloMiner kernel by DiabloD3.
 // The kernel was rewritten by me (Diapolo) and is still public-domain!
 
-#ifdef VECTORS8
-	typedef uint8 u;
-#elif defined VECTORS4
+#ifdef VECTORS4
 	typedef uint4 u;
 #elif defined VECTORS2
 	typedef uint2 u;
@@ -53,9 +51,7 @@ __kernel
 	u V[8];
 	u W[16];
 
-#ifdef VECTORS8
-	const u nonce = (uint)(get_local_id(0)) * 8U + (uint)(get_group_id(0)) * (uint)(WORKVEC) + base;
-#elif defined VECTORS4
+#ifdef VECTORS4
 	const u nonce = (uint)(get_local_id(0)) * 4U + (uint)(get_group_id(0)) * (uint)(WORKVEC) + base;
 #elif defined VECTORS2
 	const u nonce = (uint)(get_local_id(0)) * 2U + (uint)(get_group_id(0)) * (uint)(WORKVEC) + base;
@@ -116,9 +112,7 @@ __kernel
 
 //----------------------------------------------------------------------------------
 
-#ifdef VECTORS8
-	 W[0] = PreW18 + (u)(rotr25(nonce.s0), rotr25(nonce.s0) ^ 0x2004000U, rotr25(nonce.s0) ^ 0x4008000U, rotr25(nonce.s0) ^ 0x600c000U, rotr25(nonce.s0) ^ 0x8010000U, rotr25(nonce.s0) ^ 0xa014000U, rotr25(nonce.s0) ^ 0xc018000U, rotr25(nonce.s0) ^ 0xe01c000U);
-#elif defined VECTORS4
+#ifdef VECTORS4
 	 W[0] = PreW18 + (u)(rotr25(nonce.x), rotr25(nonce.x) ^ 0x2004000U, rotr25(nonce.x) ^ 0x4008000U, rotr25(nonce.x) ^ 0x600c000U);
 #elif defined VECTORS2
 	 W[0] = PreW18 + (u)(rotr25(nonce.x), rotr25(nonce.x) ^ 0x2004000U);
@@ -141,8 +135,8 @@ __kernel
 	W[14] = W[7] + PreW32 + rotr15(W[12]);
 	W[15] = W[8] + W17 + rotr15(W[13]) + rotr25(W[0]);
 
-	V[1] += 0x0fc19dc6U + V[5] + W[0] + ch(V[2], V[3], V[4]) + rotr26(V[2]);
-	V[5] =  0x0fc19dc6U + V[5] + W[0] + ch(V[2], V[3], V[4]) + rotr26(V[2]) + rotr30(V[6]) + ma(V[7], V[0], V[6]);
+	V[1] += 0x0fc19dc6U + V[5] + ch(V[2], V[3], V[4]) + rotr26(V[2]) + W[0];
+	V[5] =  0x0fc19dc6U + V[5] + ch(V[2], V[3], V[4]) + rotr26(V[2]) + W[0] + rotr30(V[6]) + ma(V[7], V[0], V[6]);
 
 	V[0] += 0x240ca1ccU + V[4] + W[1] + ch(V[1], V[2], V[3]) + rotr26(V[1]);
 	V[4] =  0x240ca1ccU + V[4] + W[1] + ch(V[1], V[2], V[3]) + rotr26(V[1]) + rotr30(V[5]) + ma(V[6], V[7], V[5]);
@@ -571,59 +565,15 @@ __kernel
 
 	V[7] += V[3] + W[12] + ch(V[0], V[1], V[2]) + rotr26(V[0]);
 
-
 #define FOUND (0x80)
 #define NFLAG (0x7F)
 
-#ifdef VECTORS8
-	V[7] ^= 0x136032edU;
-
-	bool result = V[7].s0 & V[7].s1 & V[7].s2 & V[7].s3 & V[7].s4 & V[7].s5 & V[7].s6 & V[7].s7;
-
-	if (!result) {
-		if (!V[7].s0)
-			output[FOUND] = output[NFLAG & nonce.s0] = nonce.s0;
-		if (!V[7].s1)
-			output[FOUND] = output[NFLAG & nonce.s1] = nonce.s1;
-		if (!V[7].s2)
-			output[FOUND] = output[NFLAG & nonce.s2] = nonce.s2;
-		if (!V[7].s3)
-			output[FOUND] = output[NFLAG & nonce.s3] = nonce.s3;
-		if (!V[7].s4)
-			output[FOUND] = output[NFLAG & nonce.s4] = nonce.s4;
-		if (!V[7].s5)
-			output[FOUND] = output[NFLAG & nonce.s5] = nonce.s5;
-		if (!V[7].s6)
-			output[FOUND] = output[NFLAG & nonce.s6] = nonce.s6;
-		if (!V[7].s7)
-			output[FOUND] = output[NFLAG & nonce.s7] = nonce.s7;
-	}
-#elif defined VECTORS4
-	V[7] ^= 0x136032edU;
-
-	bool result = V[7].x & V[7].y & V[7].z & V[7].w;
-
-	if (!result) {
-		if (!V[7].x)
-			output[FOUND] = output[NFLAG & nonce.x] = nonce.x;
-		if (!V[7].y)
-			output[FOUND] = output[NFLAG & nonce.y] = nonce.y;
-		if (!V[7].z)
-			output[FOUND] = output[NFLAG & nonce.z] = nonce.z;
-		if (!V[7].w)
-			output[FOUND] = output[NFLAG & nonce.w] = nonce.w;
-	}
+#ifdef VECTORS4
+	if ((V[7].x == 0x136032edU) ^ (V[7].y == 0x136032edU) ^ (V[7].z == 0x136032edU) ^ (V[7].w == 0x136032edU))
+		output[FOUND] = output[NFLAG & nonce.x] = (V[7].x == 0x136032edU) ? nonce.x : ((V[7].y == 0x136032edU) ? nonce.y : ((V[7].z == 0x136032edU) ? nonce.z : nonce.w));
 #elif defined VECTORS2
-	V[7] ^= 0x136032edU;
-
-	bool result = V[7].x & V[7].y;
-
-	if (!result) {
-		if (!V[7].x)
-			output[FOUND] = output[NFLAG & nonce.x] = nonce.x;
-		if (!V[7].y)
-			output[FOUND] = output[NFLAG & nonce.y] = nonce.y;
-	}
+	if ((V[7].x == 0x136032edU) + (V[7].y == 0x136032edU))
+		output[FOUND] = output[NFLAG & nonce.x] = (V[7].x == 0x136032edU) ? nonce.x : nonce.y;
 #else
 	if (V[7] == 0x136032edU)
 		output[FOUND] = output[NFLAG & nonce] = nonce;

+ 10 - 2
driver-bitforce.c

@@ -206,13 +206,17 @@ static void bitforce_detect_auto()
 static void bitforce_detect()
 {
 	struct string_elist *iter, *tmp;
+	const char*s;
 	bool found = false;
 	bool autoscan = false;
 
 	list_for_each_entry_safe(iter, tmp, &scan_devices, list) {
-		if (!strcmp(iter->string, "auto"))
+		s = iter->string;
+		if (!strncmp("bitforce:", iter->string, 9))
+			s += 9;
+		if (!strcmp(s, "auto"))
 			autoscan = true;
-		else if (bitforce_detect_one(iter->string)) {
+		else if (bitforce_detect_one(s)) {
 			string_elist_del(iter);
 			found = true;
 		}
@@ -308,6 +312,10 @@ static uint64_t bitforce_scanhash(struct thr_info *thr, struct work *work, uint6
 			if (temp > bitforce->cutofftemp) {
 				applog(LOG_WARNING, "Hit thermal cutoff limit on %s %d, disabling!", bitforce->api->name, bitforce->device_id);
 				bitforce->deven = DEV_RECOVER;
+
+				bitforce->device_last_not_well = time(NULL);
+				bitforce->device_not_well_reason = REASON_DEV_THERMAL_CUTOFF;
+				bitforce->dev_thermal_cutoff_count++;
 			}
 		}
 	}

+ 1 - 1
driver-cpu.c

@@ -39,7 +39,7 @@
 	#include <fcntl.h>
 #endif
 
-#ifdef __linux /* Linux specific policy and affinity management */
+#if defined(__linux) && defined(cpu_set_t) /* Linux specific policy and affinity management */
 #include <sched.h>
 static inline void drop_policy(void)
 {

+ 10 - 3
driver-icarus.c

@@ -100,7 +100,10 @@ static int icarus_open(const char *devpath)
 				    NULL, OPEN_EXISTING, 0, NULL);
 	if (unlikely(hSerial == INVALID_HANDLE_VALUE))
 		return -1;
-	/* TODO: Needs setup read block time. just like VTIME = 10 */
+
+	COMMTIMEOUTS cto = {1000, 0, 1000, 0, 1000};
+	SetCommTimeouts(hSerial, &cto);
+
 	return _open_osfhandle((LONG)hSerial, 0);
 #endif
 }
@@ -120,7 +123,7 @@ static int icarus_gets(unsigned char *buf, size_t bufLen, int fd)
 
 		rc++;
 		if (rc == ICARUS_READ_FAULT_COUNT) {
-			applog(LOG_WARNING,
+			applog(LOG_DEBUG,
 			       "Icarus Read: No data in %d seconds", rc);
 			return 1;
 		}
@@ -204,9 +207,13 @@ static bool icarus_detect_one(const char *devpath)
 static void icarus_detect()
 {
 	struct string_elist *iter, *tmp;
+	const char*s;
 
 	list_for_each_entry_safe(iter, tmp, &scan_devices, list) {
-		if (icarus_detect_one(iter->string))
+		s = iter->string;
+		if (!strncmp("icarus:", iter->string, 7))
+			s += 7;
+		if (icarus_detect_one(s))
 			string_elist_del(iter);
 	}
 }
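
Both bitforce_detect() and icarus_detect() now skip an optional "driver:" prefix on each scan-device string before probing it. The shared idea can be sketched as one small helper; skip_driver_prefix is a hypothetical name for illustration only, since the diff open-codes the strncmp() call in each driver.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: if s starts with "<driver>:", return a pointer to
 * the text after the colon; otherwise return s unchanged. No copy is made,
 * so the result points into the caller's string. */
static const char *skip_driver_prefix(const char *s, const char *driver)
{
	size_t n;

	assert(s && driver);
	n = strlen(driver);
	if (!strncmp(s, driver, n) && s[n] == ':')
		return s + n + 1;
	return s;
}
```

With this sketch, skip_driver_prefix("icarus:/dev/ttyUSB0", "icarus") yields "/dev/ttyUSB0", while a string with no prefix, or with another driver's prefix, is returned as-is.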

+ 48 - 17
driver-opencl.c

@@ -11,7 +11,10 @@
 
 #include "config.h"
 
+#ifdef HAVE_CURSES
 #include <curses.h>
+#endif
+
 #include <string.h>
 #include <stdbool.h>
 #include <stdint.h>
@@ -32,8 +35,10 @@
 
 /* TODO: cleanup externals ********************/
 
+#ifdef HAVE_CURSES
 extern WINDOW *mainwin, *statuswin, *logwin;
 extern void enable_curses(void);
+#endif
 
 extern int mining_threads;
 extern double total_secs;
@@ -526,6 +531,9 @@ void pause_dynamic_threads(int gpu)
 
 struct device_api opencl_api;
 
+#endif /* HAVE_OPENCL */
+
+#if defined(HAVE_OPENCL) && defined(HAVE_CURSES)
 void manage_gpu(void)
 {
 	struct thr_info *thr;
@@ -743,10 +751,8 @@ static _clState *clStates[MAX_GPUDEVICES];
 static cl_int queue_poclbm_kernel(_clState *clState, dev_blk_ctx *blk, cl_uint threads)
 {
 	cl_kernel *kernel = &clState->kernel;
-	cl_uint vwidth = clState->vwidth;
-	unsigned int i, num = 0;
+	unsigned int num = 0;
 	cl_int status = 0;
-	uint *nonces;
 
 	CL_SET_BLKARG(ctx_a);
 	CL_SET_BLKARG(ctx_b);
@@ -765,10 +771,15 @@ static cl_int queue_poclbm_kernel(_clState *clState, dev_blk_ctx *blk, cl_uint t
 	CL_SET_BLKARG(cty_g);
 	CL_SET_BLKARG(cty_h);
 
-	nonces = alloca(sizeof(uint) * vwidth);
-	for (i = 0; i < vwidth; i++)
-		nonces[i] = blk->nonce + (i * threads);
-	CL_SET_VARG(vwidth, nonces);
+	if (!clState->goffset) {
+		cl_uint vwidth = clState->vwidth;
+		uint *nonces = alloca(sizeof(uint) * vwidth);
+		unsigned int i;
+
+		for (i = 0; i < vwidth; i++)
+			nonces[i] = blk->nonce + (i * threads);
+		CL_SET_VARG(vwidth, nonces);
+	}
 
 	CL_SET_BLKARG(fW0);
 	CL_SET_BLKARG(fW1);
@@ -777,7 +788,6 @@ static cl_int queue_poclbm_kernel(_clState *clState, dev_blk_ctx *blk, cl_uint t
 	CL_SET_BLKARG(fW15);
 	CL_SET_BLKARG(fW01r);
 
-	CL_SET_BLKARG(fcty_e2);
 	CL_SET_BLKARG(D1A);
 	CL_SET_BLKARG(C1addK5);
 	CL_SET_BLKARG(B1addK6);
@@ -897,15 +907,19 @@ static cl_int queue_diakgcn_kernel(_clState *clState, dev_blk_ctx *blk,
 static cl_int queue_diablo_kernel(_clState *clState, dev_blk_ctx *blk, cl_uint threads)
 {
 	cl_kernel *kernel = &clState->kernel;
-	cl_uint vwidth = clState->vwidth;
-	unsigned int i, num = 0;
+	unsigned int num = 0;
 	cl_int status = 0;
-	uint *nonces;
 
-	nonces = alloca(sizeof(uint) * vwidth);
-	for (i = 0; i < vwidth; i++)
-		nonces[i] = blk->nonce + (i * threads);
-	CL_SET_VARG(vwidth, nonces);
+	if (!clState->goffset) {
+		cl_uint vwidth = clState->vwidth;
+		uint *nonces = alloca(sizeof(uint) * vwidth);
+		unsigned int i;
+
+		for (i = 0; i < vwidth; i++)
+			nonces[i] = blk->nonce + (i * threads);
+		CL_SET_VARG(vwidth, nonces);
+	}
+
 
 	CL_SET_BLKARG(PreVal0);
 	CL_SET_BLKARG(PreVal0addK7);
@@ -1178,14 +1192,21 @@ static bool opencl_thread_prepare(struct thr_info *thr)
 			applog(LOG_ERR, "Restarting the GPU from the menu will not fix this.");
 			applog(LOG_ERR, "Try restarting cgminer.");
 			failmessage = true;
+#ifdef HAVE_CURSES
 			if (use_curses) {
 				buf = curses_input("Press enter to continue");
 				if (buf)
 					free(buf);
 			}
+#endif
 		}
 		cgpu->deven = DEV_DISABLED;
 		cgpu->status = LIFE_NOSTART;
+
+		cgpu->device_last_not_well = time(NULL);
+		cgpu->device_not_well_reason = REASON_DEV_NOSTART;
+		cgpu->dev_nostart_count++;
+
 		return false;
 	}
 	if (name && !cgpu->name)
@@ -1264,6 +1285,8 @@ static bool opencl_thread_init(struct thr_info *thr)
 
 	gpu->status = LIFE_WELL;
 
+	gpu->device_last_well = time(NULL);
+
 	return true;
 }
 
@@ -1359,8 +1382,16 @@ static uint64_t opencl_scanhash(struct thr_info *thr, struct work *work,
 		memset(thrdata->res, 0, BUFFERSIZE);
 		clFinish(clState->commandQueue);
 	}
-	status = clEnqueueNDRangeKernel(clState->commandQueue, *kernel, 1, NULL,
-			globalThreads, localThreads, 0,  NULL, NULL);
+
+	if (clState->goffset) {
+		size_t global_work_offset[1];
+
+		global_work_offset[0] = work->blk.nonce;
+		status = clEnqueueNDRangeKernel(clState->commandQueue, *kernel, 1, global_work_offset,
+						globalThreads, localThreads, 0,  NULL, NULL);
+	} else
+		status = clEnqueueNDRangeKernel(clState->commandQueue, *kernel, 1, NULL,
+						globalThreads, localThreads, 0,  NULL, NULL);
 	if (unlikely(status != CL_SUCCESS)) {
 		applog(LOG_ERR, "Error: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)");
 		return 0;

+ 23 - 20
logging.c

@@ -18,9 +18,29 @@ bool opt_log_output = false;
 /* per default priorities higher than LOG_NOTICE are logged */
 int opt_log_level = LOG_NOTICE;
 
-void vapplog(int prio, const char *fmt, va_list ap)
+static void my_log_curses(int prio, char *f, va_list ap)
 {
+#ifdef HAVE_CURSES
 	extern bool use_curses;
+	if (use_curses)
+		log_curses(prio, f, ap);
+	else
+#endif
+	{
+		int len = strlen(f);
+
+		strcpy(f + len - 1, "                    \n");
+
+#ifdef HAVE_CURSES
+		log_curses(prio, f, ap);
+#else
+		vprintf(f, ap);
+#endif
+	}
+}
+
+void vapplog(int prio, const char *fmt, va_list ap)
+{
 	if (!opt_debug && prio == LOG_DEBUG)
 		return;
 
@@ -60,15 +80,7 @@ void vapplog(int prio, const char *fmt, va_list ap)
 			fflush(stderr);
 		}
 
-		if (use_curses)
-			log_curses(prio, f, ap);
-		else {
-			int len = strlen(f);
-
-			strcpy(f + len - 1, "                    \n");
-
-			log_curses(prio, f, ap);
-		}
+		my_log_curses(prio, f, ap);
 	}
 }
 
@@ -90,7 +102,6 @@ void applog(int prio, const char *fmt, ...)
  */
 static void __maybe_unused log_generic(int prio, const char *fmt, va_list ap)
 {
-	extern bool use_curses;
 #ifdef HAVE_SYSLOG_H
 	if (use_syslog) {
 		vsyslog(prio, fmt, ap);
@@ -127,15 +138,7 @@ static void __maybe_unused log_generic(int prio, const char *fmt, va_list ap)
 			fflush(stderr);
 		}
 
-		if (use_curses)
-			log_curses(prio, f, ap);
-		else {
-			int len = strlen(f);
-
-			strcpy(f + len - 1, "                    \n");
-
-			log_curses(prio, f, ap);
-		}
+		my_log_curses(prio, f, ap);
 	}
 }
 /* we can not generalize variable argument list */
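
Review note: `my_log_curses()` overwrites the last character of the format string with a run of spaces plus a newline, which only works if the buffer has room for the extra bytes. The sketch below shows the same padding step with an explicit size check. `pad_log_line` and `PAD` are hypothetical names, not part of cgminer, and the check is an added assumption about what a safe version might look like.

```c
#include <stddef.h>
#include <string.h>

/* The padding string used in the diff: spaces followed by a newline. */
#define PAD "                    \n"

/* Replace the final character of line (expected to be '\n') with PAD.
 * Returns 0 on success, -1 if line is empty or the buffer is too small. */
static int pad_log_line(char *line, size_t bufsize)
{
	size_t len = strlen(line);

	/* sizeof(PAD) includes the terminating NUL. */
	if (len == 0 || len - 1 + sizeof(PAD) > bufsize)
		return -1;

	strcpy(line + len - 1, PAD);
	return 0;
}
```

On failure the line is left untouched, so a caller could still print it unpadded.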

+ 37 - 1
miner.h

@@ -222,6 +222,28 @@ enum cl_kernels {
 	KL_DIABLO,
 };
 
+enum dev_reason {
+	REASON_THREAD_FAIL_INIT,
+	REASON_THREAD_ZERO_HASH,
+	REASON_THREAD_FAIL_QUEUE,
+	REASON_DEV_SICK_IDLE_60,
+	REASON_DEV_DEAD_IDLE_600,
+	REASON_DEV_NOSTART,
+	REASON_DEV_OVER_HEAT,
+	REASON_DEV_THERMAL_CUTOFF,
+};
+
+#define REASON_NONE			"None"
+#define REASON_THREAD_FAIL_INIT_STR	"Thread failed to init"
+#define REASON_THREAD_ZERO_HASH_STR	"Thread got zero hashes"
+#define REASON_THREAD_FAIL_QUEUE_STR	"Thread failed to queue work"
+#define REASON_DEV_SICK_IDLE_60_STR	"Device idle for 60s"
+#define REASON_DEV_DEAD_IDLE_600_STR	"Device dead - idle for 600s"
+#define REASON_DEV_NOSTART_STR		"Device failed to start"
+#define REASON_DEV_OVER_HEAT_STR	"Device over heated"
+#define REASON_DEV_THERMAL_CUTOFF_STR	"Device reached thermal cutoff"
+#define REASON_UNKNOWN_STR		"Unknown reason - code bug"
+
 struct cgpu_info {
 	int cgminer_id;
 	struct device_api *api;
@@ -275,6 +297,18 @@ struct cgpu_info {
 #endif
 	int last_share_pool;
 	time_t last_share_pool_time;
+
+	time_t device_last_well;
+	time_t device_last_not_well;
+	enum dev_reason device_not_well_reason;
+	int thread_fail_init_count;
+	int thread_zero_hash_count;
+	int thread_fail_queue_count;
+	int dev_sick_idle_60_count;
+	int dev_dead_idle_600_count;
+	int dev_nostart_count;
+	int dev_over_heat_count;	// It's a warning but worth knowing
+	int dev_thermal_cutoff_count;
 };
 
 extern bool add_cgpu(struct cgpu_info*);
@@ -478,7 +512,7 @@ extern int add_pool_details(bool live, char *url, char *user, char *pass);
 #define ADD_POOL_OK 0
 
 #define MAX_GPUDEVICES 16
-#define MAX_DEVICES 32
+#define MAX_DEVICES 64
 #define MAX_POOLS (32)
 
 #define MIN_INTENSITY -10
@@ -616,6 +650,7 @@ extern int curses_int(const char *query);
 extern char *curses_input(const char *query);
 extern void kill_work(void);
 extern void switch_pools(struct pool *selected);
+extern void remove_pool(struct pool *pool);
 extern void write_config(FILE *fcfg);
 extern void log_curses(int prio, const char *f, va_list ap);
 extern void clear_logwin(void);
@@ -628,5 +663,6 @@ extern void tq_freeze(struct thread_q *tq);
 extern void tq_thaw(struct thread_q *tq);
 extern bool successful_connect;
 extern void adl(void);
+extern void app_restart(void);
 
 #endif /* __MINER_H__ */
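
Review note: the new `enum dev_reason` and the `REASON_*_STR` macros are presumably paired somewhere in the API code, which is outside this section. A minimal sketch of such a lookup follows. The enum and strings are copied from the hunk above; `dev_reason_str` is a hypothetical helper name.

```c
#include <stddef.h>

enum dev_reason {
	REASON_THREAD_FAIL_INIT,
	REASON_THREAD_ZERO_HASH,
	REASON_THREAD_FAIL_QUEUE,
	REASON_DEV_SICK_IDLE_60,
	REASON_DEV_DEAD_IDLE_600,
	REASON_DEV_NOSTART,
	REASON_DEV_OVER_HEAT,
	REASON_DEV_THERMAL_CUTOFF,
};

#define REASON_THREAD_FAIL_INIT_STR	"Thread failed to init"
#define REASON_THREAD_ZERO_HASH_STR	"Thread got zero hashes"
#define REASON_THREAD_FAIL_QUEUE_STR	"Thread failed to queue work"
#define REASON_DEV_SICK_IDLE_60_STR	"Device idle for 60s"
#define REASON_DEV_DEAD_IDLE_600_STR	"Device dead - idle for 600s"
#define REASON_DEV_NOSTART_STR		"Device failed to start"
#define REASON_DEV_OVER_HEAT_STR	"Device over heated"
#define REASON_DEV_THERMAL_CUTOFF_STR	"Device reached thermal cutoff"
#define REASON_UNKNOWN_STR		"Unknown reason - code bug"

/* Hypothetical helper: map a reason code to its display string, falling
 * back to REASON_UNKNOWN_STR for values outside the enum. */
static const char *dev_reason_str(enum dev_reason reason)
{
	switch (reason) {
	case REASON_THREAD_FAIL_INIT:	return REASON_THREAD_FAIL_INIT_STR;
	case REASON_THREAD_ZERO_HASH:	return REASON_THREAD_ZERO_HASH_STR;
	case REASON_THREAD_FAIL_QUEUE:	return REASON_THREAD_FAIL_QUEUE_STR;
	case REASON_DEV_SICK_IDLE_60:	return REASON_DEV_SICK_IDLE_60_STR;
	case REASON_DEV_DEAD_IDLE_600:	return REASON_DEV_DEAD_IDLE_600_STR;
	case REASON_DEV_NOSTART:	return REASON_DEV_NOSTART_STR;
	case REASON_DEV_OVER_HEAT:	return REASON_DEV_OVER_HEAT_STR;
	case REASON_DEV_THERMAL_CUTOFF:	return REASON_DEV_THERMAL_CUTOFF_STR;
	}
	return REASON_UNKNOWN_STR;
}
```

Leaving out a `default:` label lets the compiler warn when a new enum value is added without a matching string.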

+ 463 - 92
miner.php

@@ -1,30 +1,79 @@
 <?php
 session_start();
 #
-global $miner, $port;
-$miner = '127.0.0.1'; # hostname or IP address
-$port = 4028;
+global $miner, $port, $readonly, $notify, $rigs;
+#
+# Don't touch these 2 - see $rigs below
+$miner = null;
+$port = null;
+#
+# Set $readonly to true to force miner.php to be readonly
+# Set $readonly to false and it will check cgminer 'privileged' access
+$readonly = false;
+#
+# Set $notify to false to NOT attempt to display the notify command
+# Set $notify to true to attempt to display the notify command
+# If an older version of cgminer returns 'Invalid command' because
+#  it lacks notify, miner.php just shows the error status table
+$notify = true;
+#
+# Set $rigs to an array of your cgminer rigs that are running
+#  format: 'IP:Port' or 'Host:Port'
+# If you only have one rig, it will just show the detail of that rig
+# If you have more than one rig it will show a summary of all the rigs
+#  with buttons to show the details of each rig
+# e.g. $rigs = array('127.0.0.1:4028','myrig.com:4028');
+$rigs = array('127.0.0.1:4028');
 #
 $here = $_SERVER['PHP_SELF'];
 #
+global $tablebegin, $tableend, $warnfont, $warnoff;
+$tablebegin = '<tr><td><table border=1 cellpadding=5 cellspacing=0>';
+$tableend = '</table></td></tr>';
+$warnfont = '<font color=red><b>';
+$warnoff = '</b></font>';
+
+#
+function htmlhead($checkapi)
+{
+ global $error, $readonly, $here;
+ if ($readonly === false && $checkapi === true)
+ {
+	$access = api('privileged');
+	if ($error != null
+	||  !isset($access['STATUS']['STATUS'])
+	||  $access['STATUS']['STATUS'] != 'S')
+		$readonly = true;
+ }
 ?>
 <html><head><title>Mine</title>
 <style type='text/css'>
 td { color:blue; font-family:verdana,arial,sans; font-size:13pt; }
 td.h { color:blue; font-family:verdana,arial,sans; font-size:13pt; background:#d0ffff }
+td.err { color:black; font-family:verdana,arial,sans; font-size:13pt; background:#ff3050 }
+td.warn { color:black; font-family:verdana,arial,sans; font-size:13pt; background:#ffb050 }
 td.sta { color:green; font-family:verdana,arial,sans; font-size:13pt; }
+td.tot { color:blue; font-family:verdana,arial,sans; font-size:13pt; background:#fff8f2 }
 </style>
 </head><body bgcolor=#ecffff>
 <script type='text/javascript'>
 function pr(a,m){if(m!=null){if(!confirm(m+'?'))return}window.location="<?php echo $here ?>"+a}
+<?php
+ if ($readonly === false && $checkapi === true)
+ {
+?>
 function prc(a,m){pr('?arg='+a,m)}
-function prs(a){var c=a.substr(3);var z=c.split('|',2);var m=z[0].substr(0,1).toUpperCase()+z[0].substr(1)+' GPU '+z[1];prc(a,m)}
-function prs2(a,n){var v=document.getElementById('gi'+n).value;var c=a.substr(3);var z=c.split('|',2);var m='Set GPU '+z[1]+' '+z[0].substr(0,1).toUpperCase()+z[0].substr(1)+' to '+v;prc(a+','+v,m)}
+function prs(a,r){var c=a.substr(3);var z=c.split('|',2);var m=z[0].substr(0,1).toUpperCase()+z[0].substr(1)+' GPU '+z[1];prc(a+'&rig='+r,m)}
+function prs2(a,n,r){var v=document.getElementById('gi'+n).value;var c=a.substr(3);var z=c.split('|',2);var m='Set GPU '+z[1]+' '+z[0].substr(0,1).toUpperCase()+z[0].substr(1)+' to '+v;prc(a+','+v+'&rig='+r,m)}
+<?php
+ }
+?>
 </script>
 <table width=100% height=100% border=0 cellpadding=0 cellspacing=0 summary='Mine'>
 <tr><td align=center valign=top>
 <table border=0 cellpadding=4 cellspacing=0 summary='Mine'>
 <?php
+}
 #
 global $error;
 $error = null;
@@ -154,121 +203,214 @@ function getparam($name, $both = false)
 #
 function fmt($section, $name, $value)
 {
+ $errorclass = ' class=err';
+ $warnclass = ' class=warn';
  $b = '&nbsp;';
 
+ $ret = $value;
+ $class = '';
+
  switch ($section.'.'.$name)
  {
- case 'GPU0.Last Share Time':
-	return date('H:i:s', $value);
+ case 'GPU.Last Share Time':
+ case 'PGA.Last Share Time':
+	$ret = date('H:i:s', $value);
 	break;
  case 'SUMMARY.Elapsed':
 	$s = $value % 60;
 	$value -= $s;
 	$value /= 60;
 	if ($value == 0)
-	{
-		return $s.'s';
-	}
+		$ret = $s.'s';
 	else
 	{
 		$m = $value % 60;
 		$value -= $m;
 		$value /= 60;
 		if ($value == 0)
-		{
-			return sprintf("%dm$b%02ds", $m, $s);
-		}
+			$ret = sprintf("%dm$b%02ds", $m, $s);
 		else
 		{
 			$h = $value % 24;
 			$value -= $h;
 			$value /= 24;
 			if ($value == 0)
-				return sprintf("%dh$b%02dm$b%02ds", $h, $m, $s);
+				$ret = sprintf("%dh$b%02dm$b%02ds", $h, $m, $s);
 			else
-				return sprintf("%ddays$b%02dh$b%02dm$b%02ds", $value, $h, $m, $s);
+			{
+				if ($value == 1)
+					$days = '';
+				else
+					$days = 's';
+
+				$ret = sprintf("%dday$days$b%02dh$b%02dm$b%02ds", $value, $h, $m, $s);
+			}
 		}
 	}
 	break;
- case 'GPU0.Utility':
+ case 'NOTIFY.Last Well':
+	if ($value == '0')
+	{
+		$ret = 'Never';
+		$class = $warnclass;
+	}
+	else
+		$ret = date('H:i:s', $value);
+	break;
+ case 'NOTIFY.Last Not Well':
+	if ($value == '0')
+		$ret = 'Never';
+	else
+	{
+		$ret = date('H:i:s', $value);
+		$class = $errorclass;
+	}
+	break;
+ case 'NOTIFY.Reason Not Well':
+	if ($value != 'None')
+		$class = $errorclass;
+	break;
+ case 'GPU.Utility':
+ case 'PGA.Utility':
  case 'SUMMARY.Utility':
-	return $value.'/m';
+	$ret = $value.'/m';
+	break;
+ case 'PGA.Temperature':
+	$ret = $value.'&deg;C';
+	break;
+ case 'GPU.Temperature':
+	$ret = $value.'&deg;C';
+ case 'GPU.Fan Speed':
+ case 'GPU.Fan Percent':
+ case 'GPU.GPU Clock':
+ case 'GPU.Memory Clock':
+ case 'GPU.GPU Voltage':
+ case 'GPU.GPU Activity':
+	if ($value == 0)
+		$class = $warnclass;
 	break;
- case 'GPU0.Temperature':
-	return $value.'&deg;C';
+ case 'GPU.MHS av':
+ case 'PGA.MHS av':
+ case 'SUMMARY.MHS av':
+ case 'GPU.Total MH':
+ case 'PGA.Total MH':
+ case 'SUMMARY.Total MH':
+ case 'SUMMARY.Getworks':
+ case 'GPU.Accepted':
+ case 'PGA.Accepted':
+ case 'SUMMARY.Accepted':
+ case 'GPU.Rejected':
+ case 'PGA.Rejected':
+ case 'SUMMARY.Rejected':
+ case 'SUMMARY.Local Work':
+ case 'POOL.Getworks':
+ case 'POOL.Accepted':
+ case 'POOL.Rejected':
+ case 'POOL.Discarded':
+	$parts = explode('.', $value, 2);
+	if (count($parts) == 1)
+		$dec = '';
+	else
+		$dec = '.'.$parts[1];
+	$ret = number_format($parts[0]).$dec;
+	break;
+ case 'GPU.Status':
+ case 'PGA.Status':
+ case 'POOL.Status':
+	if ($value != 'Alive')
+		$class = $errorclass;
+	break;
+ case 'GPU.Enabled':
+ case 'PGA.Enabled':
+	if ($value != 'Y')
+		$class = $warnclass;
 	break;
  }
 
- return $value;
+ if ($section == 'NOTIFY' && substr($name, 0, 1) == '*' && $value != '0')
+	$class = $errorclass;
+
+ return array($ret, $class);
 }
 #
-function details($cmd, $list)
+global $poolcmd;
+$poolcmd = array(	'Switch to'	=> 'switchpool',
+			'Enable'	=> 'enablepool',
+			'Disable'	=> 'disablepool' );
+#
+function showhead($cmd, $item, $values)
 {
- $stas = array('S' => 'Success', 'W' => 'Warning', 'I' => 'Informational', 'E' => 'Error', 'F' => 'Fatal');
+ global $poolcmd, $readonly;
 
- $tb = '<tr><td><table border=1 cellpadding=5 cellspacing=0>';
- $te = '</table></td></tr>';
+ echo '<tr>';
 
- echo $tb;
+ foreach ($values as $name => $value)
+ {
+	if ($name == '0')
+		$name = '&nbsp;';
+	echo "<td valign=bottom class=h>$name</td>";
+ }
 
- echo '<tr><td class=sta>Date: '.date('H:i:s j-M-Y \U\T\CP').'</td></tr>';
+ if ($cmd == 'pools' && $readonly === false)
+	foreach ($poolcmd as $name => $pcmd)
+		echo "<td valign=bottom class=h>$name</td>";
 
- echo $te.$tb;
+ echo '</tr>';
+}
+#
+function details($cmd, $list, $rig)
+{
+ global $tablebegin, $tableend;
+ global $poolcmd, $readonly;
+
+ $dfmt = 'H:i:s j-M-Y \U\T\CP';
+
+ $stas = array('S' => 'Success', 'W' => 'Warning', 'I' => 'Informational', 'E' => 'Error', 'F' => 'Fatal');
+
+ echo $tablebegin;
+
+ echo '<tr><td class=sta>Date: '.date($dfmt).'</td></tr>';
+
+ echo $tableend.$tablebegin;
 
  if (isset($list['STATUS']))
  {
 	echo '<tr>';
 	echo '<td>Computer: '.$list['STATUS']['Description'].'</td>';
+	if (isset($list['STATUS']['When']))
+		echo '<td>When: '.date($dfmt, $list['STATUS']['When']).'</td>';
 	$sta = $list['STATUS']['STATUS'];
 	echo '<td>Status: '.$stas[$sta].'</td>';
 	echo '<td>Message: '.$list['STATUS']['Msg'].'</td>';
 	echo '</tr>';
  }
 
- echo $te.$tb;
 
  $section = '';
 
- $poolcmd = array(	'Switch to'	=> 'switchpool',
-			'Enable'	=> 'enablepool',
-			'Disable'	=> 'disablepool' );
-
  foreach ($list as $item => $values)
  {
-	if ($item != 'STATUS')
-	{
-		$section = $item;
-
-		echo '<tr>';
-
-		foreach ($values as $name => $value)
-		{
-			if ($name == '0')
-				$name = '&nbsp;';
-			echo "<td valign=bottom class=h>$name</td>";
-		}
-
-		if ($cmd == 'pools')
-			foreach ($poolcmd as $name => $pcmd)
-				echo "<td valign=bottom class=h>$name</td>";
+	if ($item == 'STATUS')
+		continue;
 
-		echo '</tr>';
+	$sectionname = preg_replace('/\d/', '', $item);
 
-		break;
+	if ($sectionname != $section)
+	{
+		echo $tableend.$tablebegin;
+		showhead($cmd, $item, $values);
+		$section = $sectionname;
 	}
- }
-
- foreach ($list as $item => $values)
- {
-	if ($item == 'STATUS')
-		continue;
 
 	echo '<tr>';
 
 	foreach ($values as $name => $value)
-		echo '<td>'.fmt($section, $name, $value).'</td>';
+	{
+		list($showvalue, $class) = fmt($section, $name, $value);
+		echo "<td$class>$showvalue</td>";
+	}
 
-	if ($cmd == 'pools')
+	if ($cmd == 'pools' && $readonly === false)
 	{
 		reset($values);
 		$pool = current($values);
@@ -280,7 +422,7 @@ function details($cmd, $list)
 			else
 			{
 				echo "<input type=button value='Pool $pool'";
-				echo " onclick='prc(\"$pcmd|$pool\",\"$name Pool $pool\")'>";
+				echo " onclick='prc(\"$pcmd|$pool&rig=$rig\",\"$name Pool $pool\")'>";
 			}
 			echo '</td>';
 		}
@@ -288,14 +430,16 @@ function details($cmd, $list)
 
 	echo '</tr>';
  }
- echo $te;
+
+ echo $tableend;
 }
 #
 global $devs;
 $devs = null;
 #
-function gpubuttons($count, $info)
+function gpubuttons($count, $rig)
 {
+ global $tablebegin, $tableend;
  global $devs;
 
  $basic = array( 'GPU', 'Enable', 'Disable', 'Restart' );
@@ -306,10 +450,7 @@ function gpubuttons($count, $info)
 			'mem' => 'Memory Clock',
 			'vddc' => 'GPU Voltage' );
 
- $tb = '<tr><td><table border=1 cellpadding=5 cellspacing=0>';
- $te = '</table></td></tr>';
-
- echo $tb.'<tr>';
+ echo $tablebegin.'<tr>';
 
  foreach ($basic as $head)
 	echo "<td>$head</td>";
@@ -332,7 +473,7 @@ function gpubuttons($count, $info)
 		{
 			echo "<input type=button value='$name $c' onclick='prs(\"gpu";
 			echo strtolower($name);
-			echo "|$c\")'>";
+			echo "|$c\",$rig)'>";
 		}
 
 		echo '</td>';
@@ -346,7 +487,7 @@ function gpubuttons($count, $info)
 		else
 		{
 			$value = $devs["GPU$c"][$des];
-			echo "<input type=button value='Set $c:' onclick='prs2(\"gpu$name|$c\",$n)'>";
+			echo "<input type=button value='Set $c:' onclick='prs2(\"gpu$name|$c\",$n,$rig)'>";
 			echo "<input size=7 type=text name=gi$n value='$value' id=gi$n>";
 			$n++;
 		}
@@ -356,35 +497,37 @@ function gpubuttons($count, $info)
 
  }
 
- echo '</tr>'.$te;
+ echo '</tr>'.$tableend;
 }
 #
-function processgpus($rd, $ro)
+function processgpus($rig)
 {
  global $error;
+ global $warnfont, $warnoff;
 
  $gpus = api('gpucount');
 
  if ($error != null)
-	echo '<tr><td>Error getting GPU count: '.$rd.$error.$ro.'</td></tr>';
+	echo '<tr><td>Error getting GPU count: '.$warnfont.$error.$warnoff.'</td></tr>';
  else
  {
 	if (!isset($gpus['GPUS']['Count']))
-		echo '<tr><td>No GPU count returned: '.$rd.$gpus['STATUS']['STATUS'].' '.$gpus['STATUS']['Msg'].$ro.'</td></tr>';
+		echo '<tr><td>No GPU count returned: '.$warnfont.$gpus['STATUS']['STATUS'].' '.$gpus['STATUS']['Msg'].$warnoff.'</td></tr>';
 	else
 	{
 		$count = $gpus['GPUS']['Count'];
 		if ($count == 0)
 			echo '<tr><td>No GPUs</td></tr>';
 		else
-			gpubuttons($count);
+			gpubuttons($count, $rig);
 	}
  }
 }
 #
-function process($cmds, $rd, $ro)
+function process($cmds, $rig)
 {
  global $error, $devs;
+ global $warnfont, $warnoff;
 
  foreach ($cmds as $cmd => $des)
  {
@@ -392,13 +535,13 @@ function process($cmds, $rd, $ro)
 
 	if ($error != null)
 	{
-		echo "<tr><td>Error getting $des: ";
-		echo $rd.$error.$ro.'</td></tr>';
+		echo "<tr><td colspan=100>Error getting $des: ";
+		echo $warnfont.$error.$warnoff.'</td></tr>';
 		break;
 	}
 	else
 	{
-		details($cmd, $process);
+		details($cmd, $process, $rig);
 		echo '<tr><td><br><br></td></tr>';
 		if ($cmd == 'devs')
 			$devs = $process;
@@ -406,34 +549,262 @@ function process($cmds, $rd, $ro)
  }
 }
 #
-function display()
+# $head is a hack but this is just a demo anyway :)
+function doforeach($cmd, $des, $sum, $head)
 {
- global $error;
+ global $miner, $port;
+ global $error, $readonly, $notify, $rigs;
+ global $tablebegin, $tableend, $warnfont, $warnoff;
 
- $error = null;
+ $header = $head;
+ $anss = array();
 
- $rd = '<font color=red><b>';
- $ro = '</b></font>';
+ $count = 0;
+ foreach ($rigs as $rig)
+ {
+	$parts = explode(':', $rig, 2);
+	if (count($parts) == 2)
+	{
+		$miner = $parts[0];
+		$port = $parts[1];
+
+		$ans = api($cmd);
+
+		if ($error != null)
+		{
+			echo "<tr><td colspan=100>Error on rig $count getting $des: ";
+			echo $warnfont.$error.$warnoff.'</td></tr>';
+			$error = null;
+		}
+		else
+			$anss[$count] = $ans;
+	}
+	$count++;
+ }
+
+ if (count($anss) == 0)
+ {
+	echo "<tr><td>Failed to access any rigs successfully</td></tr>";
+	return;
+ }
+
+ $total = array();
+
+ foreach ($anss as $rig => $ans)
+ {
+	foreach ($ans as $item => $row)
+	{
+		if ($item == 'STATUS')
+			continue;
+
+		if (count($row) > count($header))
+		{
+			$header = $head;
+			foreach ($row as $name => $value)
+				if (!isset($header[$name]))
+					$header[$name] = '';
+		}
+
+		if ($sum != null)
+			foreach ($sum as $name)
+			{
+				if (isset($row[$name]))
+				{
+					if (isset($total[$name]))
+						$total[$name] += $row[$name];
+					else
+						$total[$name] = $row[$name];
+				}
+			}
+	}
+ }
+
+ if ($sum != null)
+	$anss['total']['total'] = $total;
+
+ showhead('', null, $header);
+
+ $section = '';
+
+ foreach ($anss as $rig => $ans)
+ {
+	foreach ($ans as $item => $row)
+	{
+		if ($item == 'STATUS')
+			continue;
+
+		echo '<tr>';
+
+		$newsection = preg_replace('/\d/', '', $item);
+		if ($newsection != 'total')
+			$section = $newsection;
+
+		foreach ($header as $name => $x)
+		{
+			if ($name == '')
+			{
+				if ($rig === 'total')
+					echo "<td align=right class=tot>Total:</td>";
+				else
+					echo "<td align=right><input type=button value='Rig $rig' onclick='pr(\"?rig=$rig\",null)'></td>";
+			}
+			else
+			{
+				if (isset($row[$name]))
+					list($showvalue, $class) = fmt($section, $name, $row[$name]);
+				else
+				{
+					$class = '';
+					$showvalue = '&nbsp;';
+				}
+
+				if ($rig === 'total' and $class == '')
+					$class = ' class=tot';
+
+				echo "<td$class align=right>$showvalue</td>";
+			}
+		}
+
+		echo '</tr>';
+	}
+ }
+}
+#
+function doOne($rig, $preprocess)
+{
+ global $error, $readonly, $notify;
+ global $rigs;
+
+ htmlhead(true);
+
+ $error = null;
 
  echo "<tr><td><table cellpadding=0 cellspacing=0 border=0><tr><td>";
- echo "<input type=button value='Refresh' onclick='pr(\"\",null)'>";
- echo "</td><td width=100%>&nbsp;</td><td>";
- echo "<input type=button value='Quit' onclick='prc(\"quit\",\"Quit CGMiner\")'>";
+ echo "<input type=button value='Refresh' onclick='pr(\"?rig=$rig\",null)'></td>";
+ if (count($rigs) > 1)
+	echo "<td><input type=button value='Summary' onclick='pr(\"\",null)'></td>";
+ echo "<td width=100%>&nbsp;</td><td>";
+ if ($readonly === false)
+ {
+	$msg = 'Quit CGMiner';
+	if (count($rigs) > 1)
+		$msg .= " Rig $rig";
+	echo "<input type=button value='Quit' onclick='prc(\"quit&rig=$rig\",\"$msg\")'>";
+ }
  echo "</td></tr></table></td></tr>";
 
- $arg = trim(getparam('arg', true));
- if ($arg != null and $arg != '')
-	process(array($arg => $arg), $rd, $ro);
+ if ($preprocess != null)
+	process(array($preprocess => $preprocess), $rig);
 
  $cmds = array(	'devs'    => 'device list',
 		'summary' => 'summary information',
-		'pools'   => 'pool list',
-		'config'  => 'cgminer config');
+		'pools'   => 'pool list');
+
+ if ($notify)
+	$cmds['notify'] = 'device status';
+
+ $cmds['config'] = 'cgminer config';
+
+ process($cmds, $rig);
+
+ if ($error == null && $readonly === false)
+	processgpus($rig);
+}
+#
+function display()
+{
+ global $tablebegin, $tableend;
+ global $miner, $port;
+ global $error, $readonly, $notify, $rigs;
+
+ $rig = trim(getparam('rig', true));
+
+ $arg = trim(getparam('arg', true));
+ $preprocess = null;
+ if ($arg != null and $arg != '')
+ {
+	$num = null;
+	if ($rig != null and $rig != '')
+	{
+		if ($rig >= 0 and $rig < count($rigs))
+			$num = $rig;
+	}
+	else
+		if (count($rigs) == 1)
+			$num = 0;
+
+	if ($num !== null)
+	{
+		$parts = explode(':', $rigs[$num], 2);
+		if (count($parts) == 2)
+		{
+			$miner = $parts[0];
+			$port = $parts[1];
+
+			$preprocess = $arg;
+		}
+	}
+ }
+
+ if ($rigs == null or count($rigs) == 0)
+ {
+	echo "<tr><td>No rigs defined</td></tr>";
+	return;
+ }
+
+ if (count($rigs) == 1)
+ {
+	$parts = explode(':', $rigs[0], 2);
+	if (count($parts) == 2)
+	{
+		$miner = $parts[0];
+		$port = $parts[1];
+
+		doOne(0, $preprocess);
+	}
+	else
+		echo '<tr><td>Invalid "$rigs" array</td></tr>';
+
+	return;
+ }
 
- process($cmds, $rd, $ro);
+ if ($rig != null and $rig != '' and $rig >= 0 and $rig < count($rigs))
+ {
+	$parts = explode(':', $rigs[$rig], 2);
+	if (count($parts) == 2)
+	{
+		$miner = $parts[0];
+		$port = $parts[1];
+
+		doOne($rig, $preprocess);
+	}
+	else
+		echo '<tr><td>Invalid "$rigs" array</td></tr>';
+
+	return;
+ }
+
+ htmlhead(false);
+
+ echo "<tr><td><table cellpadding=0 cellspacing=0 border=0><tr><td>";
+ echo "<input type=button value='Refresh' onclick='pr(\"\",null)'>";
+ echo "</td></tr></table></td></tr>";
 
- if ($error == null)
-	processgpus($rd, $ro);
+ if ($preprocess != null)
+	process(array($preprocess => $preprocess), $rig);
+
+ echo $tablebegin;
+ $sum = array('MHS av', 'Getworks', 'Found Blocks', 'Accepted', 'Rejected', 'Discarded', 'Stale', 'Utility', 'Local Work', 'Total MH');
+ doforeach('summary', 'summary information', $sum, array());
+ echo $tableend;
+ echo '<tr><td><br><br></td></tr>';
+ echo $tablebegin;
+ doforeach('devs', 'device list', $sum, array(''=>'','ID'=>'','Name'=>''));
+ echo $tableend;
+ echo '<tr><td><br><br></td></tr>';
+ echo $tablebegin;
+ doforeach('pools', 'pool list', $sum, array(''=>''));
+ echo $tableend;
 }
 #
 display();
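
Review note: the `SUMMARY.Elapsed` case in `fmt()` splits a second count into days, hours, minutes and seconds, and this commit makes the "day" label singular or plural. For readers following the C side of the diff, here is the same arithmetic restated in C. `format_elapsed` is a hypothetical name, and a plain space stands in for the `&nbsp;` separator used by the PHP.

```c
#include <stdio.h>

/* Format secs as "Ns", "Mm SSs", "Hh MMm SSs" or "Dday(s) HHh MMm SSs",
 * mirroring the branching of the PHP fmt() case. */
static void format_elapsed(unsigned long secs, char *out, size_t outlen)
{
	unsigned long s = secs % 60;
	unsigned long v = secs / 60;	/* remaining whole minutes */
	unsigned long m, h;

	if (v == 0) {
		snprintf(out, outlen, "%lus", s);
		return;
	}

	m = v % 60;
	v /= 60;			/* remaining whole hours */
	if (v == 0) {
		snprintf(out, outlen, "%lum %02lus", m, s);
		return;
	}

	h = v % 24;
	v /= 24;			/* remaining whole days */
	if (v == 0)
		snprintf(out, outlen, "%luh %02lum %02lus", h, m, s);
	else
		snprintf(out, outlen, "%luday%s %02luh %02lum %02lus",
			 v, v == 1 ? "" : "s", h, m, s);
}
```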

+ 26 - 23
ocl.c

@@ -335,51 +335,46 @@ _clState *initCl(unsigned int gpu, char *name, size_t nameSize)
 	/* Create binary filename based on parameters passed to opencl
 	 * compiler to ensure we only load a binary that matches what would
 	 * have otherwise created. The filename is:
-	 * name + kernelname + v + vectors + w + work_size + l + sizeof(long) + .bin
+	 * name + kernelname +/- g(offset) + v + vectors + w + work_size + l + sizeof(long) + .bin
 	 */
 	char binaryfilename[255];
 	char filename[255];
 	char numbuf[10];
 
 	if (gpus[gpu].kernel == KL_NONE) {
-		if (strstr(vbuff, "844.4") || // Linux 64 bit ATI 2.6 SDK
-		    strstr(vbuff, "851.4") || // Windows 64 bit ""
-		    strstr(vbuff, "831.4")) { // Windows & Linux 32 bit ""
-			if (strstr(name, "Tahiti")) {
-				applog(LOG_INFO, "Selecting poclbm kernel");
-				clState->chosen_kernel = KL_POCLBM;
-			} else {
+		/* Detect all 2.6 SDKs not with Tahiti and use diablo kernel */
+		if (!strstr(name, "Tahiti") &&
+			(strstr(vbuff, "844.4") ||  // Linux 64 bit ATI 2.6 SDK
+			 strstr(vbuff, "851.4") ||  // Windows 64 bit ""
+			 strstr(vbuff, "831.4") ||
+			 strstr(vbuff, "898.1"))) { // 12.2 driver SDK
 				applog(LOG_INFO, "Selecting diablo kernel");
 				clState->chosen_kernel = KL_DIABLO;
-			}
-		} else if (strstr(vbuff, "898.1") || // Windows 64 bit 12.2 driver
-			   strstr(name, "Tahiti")) { // All non SDK 2.6 79x0
-				applog(LOG_INFO, "Selecting diablo kernel");
-				clState->chosen_kernel = KL_DIABLO;
-		} else if (clState->hasBitAlign) {
-			applog(LOG_INFO, "Selecting phatk kernel");
-			clState->chosen_kernel = KL_PHATK;
-		} else {
+		/* Detect all 7970s, older ATI and NVIDIA and use poclbm */
+		} else if (strstr(name, "Tahiti") || !clState->hasBitAlign) {
 			applog(LOG_INFO, "Selecting poclbm kernel");
 			clState->chosen_kernel = KL_POCLBM;
+		/* Use phatk for the rest R5xxx R6xxx */
+		} else {
+			applog(LOG_INFO, "Selecting phatk kernel");
+			clState->chosen_kernel = KL_PHATK;
 		}
-
 		gpus[gpu].kernel = clState->chosen_kernel;
 	} else
 		clState->chosen_kernel = gpus[gpu].kernel;
 
 	/* For some reason 2 vectors is still better even if the card says
 	 * otherwise, and many cards lie about their max so use 256 as max
-	 * unless explicitly set on the command line. */
-	if (preferred_vwidth > 2)
+	 * unless explicitly set on the command line. Tahiti prefers 1 */
+	if (strstr(name, "Tahiti"))
+		preferred_vwidth = 1;
+	else if (preferred_vwidth > 2)
 		preferred_vwidth = 2;
 
 	switch (clState->chosen_kernel) {
 		case KL_POCLBM:
 			strcpy(filename, POCLBM_KERNNAME".cl");
 			strcpy(binaryfilename, POCLBM_KERNNAME);
-			/* This kernel prefers to not use vectors */
-			preferred_vwidth = 1;
 			break;
 		case KL_PHATK:
 			strcpy(filename, PHATK_KERNNAME".cl");
@@ -403,6 +398,10 @@ _clState *initCl(unsigned int gpu, char *name, size_t nameSize)
 		gpus[gpu].vwidth = preferred_vwidth;
 	}
 
+	if ((clState->chosen_kernel == KL_POCLBM || clState->chosen_kernel == KL_DIABLO) &&
+		clState->vwidth == 1 && clState->hasOpenCL11plus)
+			clState->goffset = true;
+
 	if (gpus[gpu].work_size && gpus[gpu].work_size <= clState->max_work_size)
 		clState->wsize = gpus[gpu].work_size;
 	else if (strstr(name, "Tahiti"))
@@ -436,7 +435,8 @@ _clState *initCl(unsigned int gpu, char *name, size_t nameSize)
 	}
 
 	strcat(binaryfilename, name);
-
+	if (clState->goffset)
+		strcat(binaryfilename, "g");
 	strcat(binaryfilename, "v");
 	sprintf(numbuf, "%d", clState->vwidth);
 	strcat(binaryfilename, numbuf);
@@ -538,6 +538,9 @@ build:
 	} else
 		applog(LOG_DEBUG, "BFI_INT patch requiring device not found, will not BFI_INT patch");
 
+	if (clState->goffset)
+		strcat(CompilerOptions, " -D GOFFSET");
+
 	applog(LOG_DEBUG, "CompilerOptions: %s", CompilerOptions);
 	status = clBuildProgram(clState->program, 1, &devices[gpu], CompilerOptions , NULL, NULL);
 	free(CompilerOptions);
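
Review note: the reworked selection in `initCl()` reduces to three rules, plus a separate condition for enabling `goffset`. The sketch below restates them as pure functions so the decision table is easier to check. `enum sketch_kernel`, `choose_kernel` and `want_goffset` are hypothetical names, and the enum is a local stand-in rather than cgminer's `enum cl_kernels`.

```c
#include <stdbool.h>
#include <string.h>

/* Local stand-in for the kernel choices that appear in the hunk. */
enum sketch_kernel { SK_POCLBM, SK_PHATK, SK_DIABLO };

/* name is the device name, vbuff the driver/SDK version string. */
static enum sketch_kernel choose_kernel(const char *name, const char *vbuff,
					bool has_bitalign)
{
	bool tahiti = strstr(name, "Tahiti") != NULL;
	bool sdk26 = strstr(vbuff, "844.4") || strstr(vbuff, "851.4") ||
		     strstr(vbuff, "831.4") || strstr(vbuff, "898.1");

	/* All 2.6 SDKs (and the 12.2 driver SDK) on non-Tahiti: diablo */
	if (!tahiti && sdk26)
		return SK_DIABLO;
	/* Tahiti, older ATI and NVIDIA (no BITALIGN): poclbm */
	if (tahiti || !has_bitalign)
		return SK_POCLBM;
	/* Everything else: phatk */
	return SK_PHATK;
}

/* goffset is enabled only for poclbm or diablo, a vector width of 1,
 * and a device reporting OpenCL 1.1 or later. */
static bool want_goffset(enum sketch_kernel k, unsigned vwidth,
			 bool has_opencl11plus)
{
	return (k == SK_POCLBM || k == SK_DIABLO) &&
	       vwidth == 1 && has_opencl11plus;
}
```

The version strings passed in are whatever the driver reports; only the substrings tested in the hunk matter here.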

+ 1 - 0
ocl.h

@@ -21,6 +21,7 @@ typedef struct {
 	cl_mem outputBuffer;
 	bool hasBitAlign;
 	bool hasOpenCL11plus;
+	bool goffset;
 	cl_uint vwidth;
 	size_t max_work_size;
 	size_t wsize;

+ 0 - 1288
poclbm120222.cl

@@ -1,1288 +0,0 @@
-// -ck modified kernel taken from Phoenix taken from poclbm, with aspects of
-// phatk and others.
-// Modified version copyright 2011-2012 Con Kolivas
-
-// This file is taken and modified from the public-domain poclbm project, and
-// we have therefore decided to keep it public-domain in Phoenix.
-
-#ifdef VECTORS4
-	typedef uint4 u;
-#elif defined VECTORS2
-	typedef uint2 u;
-#else
-	typedef uint u;
-#endif
-
-__constant uint K[64] = { 
-    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
-    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
-    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
-    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
-    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
-    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
-    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
-    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
-};
-
-
-// This part is not from the stock poclbm kernel. It's part of an optimization
-// added in the Phoenix Miner.
-
-// Some AMD devices have a BFI_INT opcode, which behaves exactly like the
-// SHA-256 ch function, but provides it in exactly one instruction. If
-// detected, use it for ch. Otherwise, construct ch out of simpler logical
-// primitives.
-
-#ifdef BITALIGN
-	#pragma OPENCL EXTENSION cl_amd_media_ops : enable
-	#define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)
- #ifdef BFI_INT
-	// Well, slight problem... It turns out BFI_INT isn't actually exposed to
-	// OpenCL (or CAL IL for that matter) in any way. However, there is 
-	// a similar instruction, BYTE_ALIGN_INT, which is exposed to OpenCL via
-	// amd_bytealign, takes the same inputs, and provides the same output. 
-	// We can use that as a placeholder for BFI_INT and have the application 
-	// patch it after compilation.
-	
-	// This is the BFI_INT function
-	#define ch(x, y, z) amd_bytealign(x, y, z)
-	
-	// Ma can also be implemented in terms of BFI_INT...
-	#define Ma(x, y, z) amd_bytealign( (z^x), (y), (x) )
- #else // BFI_INT
-	// Later SDKs optimise this to BFI INT without patching and GCN
-	// actually fails if manually patched with BFI_INT
-
-	#define ch(x, y, z) bitselect((u)z, (u)y, (u)x)
-	#define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
-#endif
-#else // BITALIGN
-	#define ch(x, y, z) (z ^ (x & (y ^ z)))
-	#define Ma(x, y, z) ((x & z) | (y & (x | z)))
-	#define rotr(x, y) rotate((u)x, (u)(32 - y))
-#endif
-
-// AMD's KernelAnalyzer throws errors compiling the kernel if we use 
-// amd_bytealign on constants with vectors enabled, so we use this to avoid 
-// problems. (this is used 4 times, and likely optimized out by the compiler.)
-#define Ma2(x, y, z) ((y & z) | (x & (y | z)))
-
-__kernel void search(const uint state0, const uint state1, const uint state2, const uint state3,
-						const uint state4, const uint state5, const uint state6, const uint state7,
-						const uint b1, const uint c1,
-						const uint f1, const uint g1, const uint h1,
-						const u base,
-						const uint fw0, const uint fw1, const uint fw2, const uint fw3, const uint fw15, const uint fw01r,
-						const uint fcty_e2,
-						const uint D1A, const uint C1addK5, const uint B1addK6,
-						const uint W16addK16, const uint W17addK17,
-						const uint PreVal4addT1, const uint Preval0,
-						__global uint * output)
-{
-	u W[24];
-	u *Vals = &W[16]; // Now put at W[16] to be in same array
-
-	const u nonce = base + (uint)(get_global_id(0));
-
-
-Vals[0]=Preval0+nonce;
-
-Vals[3]=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],b1,c1);
-Vals[3]+=D1A;
-
-Vals[7]=Vals[3];
-Vals[7]+=h1;
-Vals[4]=PreVal4addT1+nonce;
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-
-Vals[2]=C1addK5;
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],b1);
-
-Vals[6]=Vals[2];
-Vals[6]+=g1;
-Vals[3]+=Ma2(g1,Vals[4],f1);
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-
-Vals[1]=B1addK6;
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-
-Vals[5]=Vals[1];
-Vals[5]+=f1;
-Vals[2]+=Ma2(f1,Vals[3],Vals[4]);
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[7];
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[8];
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[9];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[10];
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[11];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[12];
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[13];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[14];
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=0xC19BF3F4U;
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=W16addK16;
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=W17addK17;
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-
-W[2]=(rotr(nonce,7)^rotr(nonce,18)^(nonce>>3U));
-W[2]+=fw2;
-Vals[5]+=W[2];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[18];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-
-W[3]=nonce;
-W[3]+=fw3;
-Vals[4]+=W[3];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[19];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-
-W[4]=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
-W[4]+=0x80000000U;
-Vals[3]+=W[4];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[20];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-
-W[5]=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
-Vals[2]+=W[5];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[21];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-
-W[6]=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
-W[6]+=0x00000280U;
-Vals[1]+=W[6];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[22];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-
-W[7]=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
-W[7]+=fw0;
-Vals[0]+=W[7];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[23];
-
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-
-W[8]=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
-W[8]+=fw1;
-Vals[7]+=W[8];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[24];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-
-W[9]=W[2];
-W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
-Vals[6]+=W[9];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[25];
-
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-
-W[10]=W[3];
-W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
-Vals[5]+=W[10];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[26];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-
-W[11]=W[4];
-W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
-Vals[4]+=W[11];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[27];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-
-W[12]=W[5];
-W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
-Vals[3]+=W[12];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[28];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-
-W[13]=W[6];
-W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
-Vals[2]+=W[13];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[29];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-
-W[14]=0x00a00055U;
-W[14]+=W[7];
-W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
-Vals[1]+=W[14];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[30];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-
-W[15]=fw15;
-W[15]+=W[8];
-W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
-Vals[0]+=W[15];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[31];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-
-W[0]=fw01r;
-W[0]+=W[9];
-W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
-Vals[7]+=W[0];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[32];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-
-W[1]=fw1;
-W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
-W[1]+=W[10];
-W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
-Vals[6]+=W[1];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[33];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
-W[2]+=W[11];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
-Vals[5]+=K[34];
-Vals[5]+=W[2];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
-W[3]+=W[12];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[35];
-W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
-Vals[4]+=W[3];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
-W[4]+=W[13];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
-Vals[3]+=K[36];
-Vals[3]+=W[4];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
-W[5]+=W[14];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[37];
-W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
-Vals[2]+=W[5];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
-W[6]+=W[15];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
-Vals[1]+=K[38];
-Vals[1]+=W[6];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
-W[7]+=W[0];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[39];
-W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
-Vals[0]+=W[7];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
-W[8]+=W[1];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
-Vals[7]+=K[40];
-Vals[7]+=W[8];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
-W[9]+=W[2];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[41];
-W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
-Vals[6]+=W[9];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
-W[10]+=W[3];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
-Vals[5]+=K[42];
-Vals[5]+=W[10];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-W[11]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
-W[11]+=W[4];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[43];
-W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
-Vals[4]+=W[11];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-W[12]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
-W[12]+=W[5];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
-Vals[3]+=K[44];
-Vals[3]+=W[12];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-W[13]+=(rotr(W[14],7)^rotr(W[14],18)^(W[14]>>3U));
-W[13]+=W[6];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[45];
-W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
-Vals[2]+=W[13];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-W[14]+=(rotr(W[15],7)^rotr(W[15],18)^(W[15]>>3U));
-W[14]+=W[7];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
-Vals[1]+=K[46];
-Vals[1]+=W[14];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-W[15]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
-W[15]+=W[8];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[47];
-W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
-Vals[0]+=W[15];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
-W[0]+=W[9];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
-Vals[7]+=K[48];
-Vals[7]+=W[0];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
-W[1]+=W[10];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[49];
-W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
-Vals[6]+=W[1];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
-W[2]+=W[11];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
-Vals[5]+=K[50];
-Vals[5]+=W[2];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
-W[3]+=W[12];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[51];
-W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
-Vals[4]+=W[3];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
-W[4]+=W[13];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
-Vals[3]+=K[52];
-Vals[3]+=W[4];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
-W[5]+=W[14];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[53];
-W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
-Vals[2]+=W[5];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
-W[6]+=W[15];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
-Vals[1]+=K[54];
-Vals[1]+=W[6];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
-W[7]+=W[0];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[55];
-W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
-Vals[0]+=W[7];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
-W[8]+=W[1];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
-Vals[7]+=K[56];
-Vals[7]+=W[8];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
-W[9]+=W[2];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[57];
-W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
-Vals[6]+=W[9];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
-W[10]+=W[3];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
-Vals[5]+=K[58];
-Vals[5]+=W[10];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-W[11]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
-W[11]+=W[4];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[59];
-W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
-Vals[4]+=W[11];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-W[12]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
-W[12]+=W[5];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
-Vals[3]+=K[60];
-Vals[3]+=W[12];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-W[13]+=(rotr(W[14],7)^rotr(W[14],18)^(W[14]>>3U));
-W[13]+=W[6];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[61];
-W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
-Vals[2]+=W[13];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-W[14]+=(rotr(W[15],7)^rotr(W[15],18)^(W[15]>>3U));
-W[14]+=W[7];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
-Vals[1]+=K[62];
-Vals[1]+=W[14];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-W[15]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
-W[15]+=W[8];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[63];
-W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
-Vals[0]+=W[15];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-
-W[0]=Vals[0];
-
-W[7]=state7;
-W[7]+=Vals[7];
-
-Vals[7]=0xF377ED68U;
-W[0]+=state0;
-Vals[7]+=W[0];
-
-W[3]=state3;
-W[3]+=Vals[3];
-
-Vals[3]=0xa54ff53aU;
-Vals[3]+=Vals[7];
-
-W[1]=Vals[1];
-W[1]+=state1;
-
-W[6]=state6;
-W[6]+=Vals[6];
-
-Vals[6]=0x90BB1E3CU;
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=(0x9b05688cU^(Vals[3]&0xca0b3af3U));
-
-W[2]=state2;
-W[2]+=Vals[2];
-
-Vals[2]=0x3c6ef372U;
-Vals[6]+=W[1];
-Vals[2]+=Vals[6];
-Vals[7]+=0x08909ae5U;
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-
-W[5]=state5;
-W[5]+=Vals[5];
-
-Vals[5]=0x50C6645BU;
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],0x510e527fU);
-Vals[5]+=W[2];
-
-Vals[1]=0xbb67ae85U;
-Vals[1]+=Vals[5];
-Vals[6]+=Ma2(0xbb67ae85U,Vals[7],0x6a09e667U);
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-
-W[4]=state4;
-W[4]+=Vals[4];
-
-Vals[4]=0x3AC42E24U;
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=W[3];
-
-Vals[0]=Vals[4];
-Vals[0]+=0x6a09e667U;
-Vals[5]+=Ma2(0x6a09e667U,Vals[6],Vals[7]);
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[4];
-Vals[3]+=W[4];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[5];
-Vals[2]+=W[5];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[6];
-Vals[1]+=W[6];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[7];
-Vals[0]+=W[7];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=0x5807AA98U;
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[9];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[10];
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[11];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[12];
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[13];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[14];
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=0xC19BF274U;
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
-Vals[7]+=K[16];
-Vals[7]+=W[0];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
-W[1]+=0x00a00000U;
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[17];
-Vals[6]+=W[1];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
-W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[18];
-Vals[5]+=W[2];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
-W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[19];
-Vals[4]+=W[3];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
-W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[20];
-Vals[3]+=W[4];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
-W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[21];
-Vals[2]+=W[5];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
-W[6]+=0x00000100U;
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
-Vals[1]+=K[22];
-Vals[1]+=W[6];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-W[7]+=0x11002000U;
-W[7]+=W[0];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[23];
-W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
-Vals[0]+=W[7];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-
-W[8]=0x80000000U;
-W[8]+=W[1];
-W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
-Vals[7]+=W[8];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[24];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-
-W[9]=W[2];
-W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
-Vals[6]+=W[9];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[25];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-
-W[10]=W[3];
-W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
-Vals[5]+=W[10];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[26];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-
-W[11]=W[4];
-W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
-Vals[4]+=W[11];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[27];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-
-W[12]=W[5];
-W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
-Vals[3]+=W[12];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[28];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-
-W[13]=W[6];
-W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
-Vals[2]+=W[13];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[29];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-
-W[14]=0x00400022U;
-W[14]+=W[7];
-W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
-Vals[1]+=W[14];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[30];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-
-W[15]=0x00000100U;
-W[15]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
-W[15]+=W[8];
-W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
-Vals[0]+=W[15];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[31];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-
-W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
-W[0]+=W[9];
-W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
-Vals[7]+=W[0];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[32];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-
-W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
-W[1]+=W[10];
-W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
-Vals[6]+=W[1];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[33];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-
-W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
-W[2]+=W[11];
-W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
-Vals[5]+=W[2];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[34];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-
-W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
-W[3]+=W[12];
-W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
-Vals[4]+=W[3];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[35];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-
-W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
-W[4]+=W[13];
-W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
-Vals[3]+=W[4];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[36];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-
-W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
-W[5]+=W[14];
-W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
-Vals[2]+=W[5];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[37];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-
-W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
-W[6]+=W[15];
-W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
-Vals[1]+=W[6];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[38];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-
-W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
-W[7]+=W[0];
-W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
-Vals[0]+=W[7];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[39];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-
-W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
-W[8]+=W[1];
-W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
-Vals[7]+=W[8];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[40];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-
-W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
-W[9]+=W[2];
-W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
-Vals[6]+=W[9];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[41];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-
-W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
-W[10]+=W[3];
-W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
-Vals[5]+=W[10];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[42];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-
-W[11]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
-W[11]+=W[4];
-W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
-Vals[4]+=W[11];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[43];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-
-W[12]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
-W[12]+=W[5];
-W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
-Vals[3]+=W[12];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[44];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-
-W[13]+=(rotr(W[14],7)^rotr(W[14],18)^(W[14]>>3U));
-W[13]+=W[6];
-W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
-Vals[2]+=W[13];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[45];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-
-W[14]+=(rotr(W[15],7)^rotr(W[15],18)^(W[15]>>3U));
-W[14]+=W[7];
-W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
-Vals[1]+=W[14];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[46];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-
-W[15]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
-W[15]+=W[8];
-W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
-Vals[0]+=W[15];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[47];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-
-W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
-W[0]+=W[9];
-W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
-Vals[7]+=W[0];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[48];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-
-W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
-W[1]+=W[10];
-W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
-Vals[6]+=W[1];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[49];
-Vals[2]+=Vals[6];
-Vals[6]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
-
-W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
-W[2]+=W[11];
-W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
-Vals[5]+=W[2];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[50];
-Vals[6]+=Ma(Vals[1],Vals[7],Vals[0]);
-Vals[1]+=Vals[5];
-Vals[5]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
-Vals[5]+=Ma(Vals[0],Vals[6],Vals[7]);
-
-W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
-W[3]+=W[12];
-W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
-Vals[4]+=W[3];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[51];
-Vals[0]+=Vals[4];
-Vals[4]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
-
-W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
-W[4]+=W[13];
-W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
-Vals[3]+=W[4];
-Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[3]+=ch(Vals[0],Vals[1],Vals[2]);
-Vals[3]+=K[52];
-Vals[4]+=Ma(Vals[7],Vals[5],Vals[6]);
-Vals[7]+=Vals[3];
-Vals[3]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
-Vals[3]+=Ma(Vals[6],Vals[4],Vals[5]);
-
-W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
-W[5]+=W[14];
-W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
-Vals[2]+=W[5];
-Vals[2]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
-Vals[2]+=ch(Vals[7],Vals[0],Vals[1]);
-Vals[2]+=K[53];
-Vals[6]+=Vals[2];
-Vals[2]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
-
-W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
-W[6]+=W[15];
-W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
-Vals[1]+=W[6];
-Vals[1]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
-Vals[1]+=ch(Vals[6],Vals[7],Vals[0]);
-Vals[1]+=K[54];
-Vals[2]+=Ma(Vals[5],Vals[3],Vals[4]);
-Vals[5]+=Vals[1];
-Vals[1]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
-Vals[1]+=Ma(Vals[4],Vals[2],Vals[3]);
-
-W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
-W[7]+=W[0];
-W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
-Vals[0]+=W[7];
-Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
-Vals[0]+=ch(Vals[5],Vals[6],Vals[7]);
-Vals[0]+=K[55];
-Vals[4]+=Vals[0];
-Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
-
-W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
-W[8]+=W[1];
-W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
-Vals[7]+=W[8];
-Vals[7]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
-Vals[7]+=ch(Vals[4],Vals[5],Vals[6]);
-Vals[7]+=K[56];
-Vals[0]+=Ma(Vals[3],Vals[1],Vals[2]);
-Vals[3]+=Vals[7];
-Vals[7]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
-Vals[7]+=Ma(Vals[2],Vals[0],Vals[1]);
-
-W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
-W[9]+=W[2];
-W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
-Vals[6]+=W[9];
-Vals[6]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
-Vals[6]+=ch(Vals[3],Vals[4],Vals[5]);
-Vals[6]+=K[57];
-
-W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
-W[10]+=W[3];
-W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
-Vals[5]+=W[10];
-Vals[2]+=Vals[6];
-Vals[5]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
-Vals[5]+=ch(Vals[2],Vals[3],Vals[4]);
-Vals[5]+=K[58];
-
-W[11]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
-W[11]+=W[4];
-W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
-Vals[4]+=W[11];
-Vals[1]+=Vals[5];
-Vals[4]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
-Vals[4]+=ch(Vals[1],Vals[2],Vals[3]);
-Vals[4]+=K[59];
-
-W[12]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
-W[12]+=W[5];
-W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
-Vals[7]+=W[12];
-Vals[0]+=Vals[4];
-Vals[7]+=Vals[3];
-Vals[7]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
-Vals[7]+=ch(Vals[0],Vals[1],Vals[2]);
-//Vals[7]+=K[60]; diffed from 0xA41F32E7
-
-#define FOUND (0x80)
-#define NFLAG (0x7F)
-
-#if defined(VECTORS4)
-	Vals[7] ^= 0x136032edU;
-
-	bool result = Vals[7].x & Vals[7].y & Vals[7].z & Vals[7].w;
-
-	if (!result) {
-		if (!Vals[7].x)
-			output[FOUND] = output[NFLAG & nonce.x] = nonce.x;
-		if (!Vals[7].y)
-			output[FOUND] = output[NFLAG & nonce.y] = nonce.y;
-		if (!Vals[7].z)
-			output[FOUND] = output[NFLAG & nonce.z] = nonce.z;
-		if (!Vals[7].w)
-			output[FOUND] = output[NFLAG & nonce.w] = nonce.w;
-	}
-#elif defined VECTORS2
-	Vals[7] ^= 0x136032edU;
-
-	bool result = Vals[7].x & Vals[7].y;
-
-	if (!result) {
-		if (!Vals[7].x)
-			output[FOUND] = output[FOUND] = output[NFLAG & nonce.x] = nonce.x;
-		if (!Vals[7].y)
-			output[FOUND] = output[FOUND] = output[NFLAG & nonce.y] = nonce.y;
-	}
-#else
-	if (Vals[7] == 0x136032edU)
-		output[FOUND] = output[NFLAG & nonce] =  nonce;
-#endif
-}

+ 1353 - 0
poclbm120327.cl

@@ -0,0 +1,1353 @@
+// -ck modified kernel taken from Phoenix taken from poclbm, with aspects of
+// phatk and others.
+// Modified version copyright 2011-2012 Con Kolivas
+
+// This file is taken and modified from the public-domain poclbm project, and
+// we have therefore decided to keep it public-domain in Phoenix.
+
+#ifdef VECTORS4
+	typedef uint4 u;
+#elif defined VECTORS2
+	typedef uint2 u;
+#else
+	typedef uint u;
+#endif
+
+__constant uint K[64] = { 
+    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
+    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
+    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
+    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
+    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
+    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
+    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
+    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
+};
+
+
+// This part is not from the stock poclbm kernel. It's part of an optimization
+// added in the Phoenix Miner.
+
+// Some AMD devices have a BFI_INT opcode, which behaves exactly like the
+// SHA-256 ch function, but provides it in exactly one instruction. If
+// detected, use it for ch. Otherwise, construct ch out of simpler logical
+// primitives.
+
+#ifdef BITALIGN
+	#pragma OPENCL EXTENSION cl_amd_media_ops : enable
+	#define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)
+#else
+	#define rotr(x, y) rotate((u)x, (u)(32 - y))
+#endif
+#ifdef BFI_INT
+	// Well, slight problem... It turns out BFI_INT isn't actually exposed to
+	// OpenCL (or CAL IL for that matter) in any way. However, there is 
+	// a similar instruction, BYTE_ALIGN_INT, which is exposed to OpenCL via
+	// amd_bytealign, takes the same inputs, and provides the same output. 
+	// We can use that as a placeholder for BFI_INT and have the application 
+	// patch it after compilation.
+	
+	// This is the BFI_INT function
+	#define ch(x, y, z) amd_bytealign(x, y, z)
+	
+	// Ma can also be implemented in terms of BFI_INT...
+	#define Ma(x, y, z) amd_bytealign( (z^x), (y), (x) )
+
+	// AMD's KernelAnalyzer throws errors compiling the kernel if we use
+	// amd_bytealign on constants with vectors enabled, so we use this to avoid
+	// problems. (this is used 4 times, and likely optimized out by the compiler.)
+	#define Ma2(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
+#else // BFI_INT
+	//GCN actually fails if manually patched with BFI_INT
+
+	#define ch(x, y, z) bitselect((u)z, (u)y, (u)x)
+	#define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
+	#define Ma2(x, y, z) Ma(x, y, z)
+#endif
+
+
+__kernel
+__attribute__((vec_type_hint(u)))
+__attribute__((reqd_work_group_size(WORKSIZE, 1, 1)))
+void search(const uint state0, const uint state1, const uint state2, const uint state3,
+	const uint state4, const uint state5, const uint state6, const uint state7,
+	const uint b1, const uint c1,
+	const uint f1, const uint g1, const uint h1,
+#ifndef GOFFSET
+	const u base,
+#endif
+	const uint fw0, const uint fw1, const uint fw2, const uint fw3, const uint fw15, const uint fw01r,
+	const uint D1A, const uint C1addK5, const uint B1addK6,
+	const uint W16addK16, const uint W17addK17,
+	const uint PreVal4addT1, const uint Preval0,
+	__global uint * output)
+{
+	u Vals[24];
+	u *W = &Vals[8];
+
+#ifdef GOFFSET
+	const u nonce = (uint)(get_global_id(0));
+#else
+	const u nonce = base + (uint)(get_global_id(0));
+#endif
+
+Vals[5]=Preval0;
+Vals[5]+=nonce;
+
+Vals[0]=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],b1,c1);
+Vals[0]+=D1A;
+
+Vals[2]=Vals[0];
+Vals[2]+=h1;
+
+Vals[1]=PreVal4addT1;
+Vals[1]+=nonce;
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+
+Vals[6]=C1addK5;
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],b1);
+
+Vals[3]=Vals[6];
+Vals[3]+=g1;
+Vals[0]+=Ma2(g1,Vals[1],f1);
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma2(f1,Vals[0],Vals[1]);
+
+Vals[7]=B1addK6;
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+
+Vals[4]=Vals[7];
+Vals[4]+=f1;
+
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[7];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[8];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[9];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[10];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[11];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[12];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[13];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[14];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=0xC19BF3F4U;
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=W16addK16;
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=W17addK17;
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[2]=(rotr(nonce,7)^rotr(nonce,18)^(nonce>>3U));
+W[2]+=fw2;
+Vals[4]+=W[2];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[18];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[3]=nonce;
+W[3]+=fw3;
+Vals[1]+=W[3];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[19];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[4]=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
+W[4]+=0x80000000U;
+Vals[0]+=W[4];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[20];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[5]=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
+Vals[6]+=W[5];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[21];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[6]=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
+W[6]+=0x00000280U;
+Vals[7]+=W[6];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[22];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[7]=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
+W[7]+=fw0;
+Vals[5]+=W[7];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[23];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[8]=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
+W[8]+=fw1;
+Vals[2]+=W[8];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[24];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[9]=W[2];
+W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
+Vals[3]+=W[9];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[25];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[10]=W[3];
+W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
+Vals[4]+=W[10];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[26];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[11]=W[4];
+W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
+Vals[1]+=W[11];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[27];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[12]=W[5];
+W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
+Vals[0]+=W[12];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[28];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[13]=W[6];
+W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
+Vals[6]+=W[13];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[29];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[14]=0x00a00055U;
+W[14]+=W[7];
+W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
+Vals[7]+=W[14];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[30];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[15]=fw15;
+W[15]+=W[8];
+W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
+Vals[5]+=W[15];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[31];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[0]=fw01r;
+W[0]+=W[9];
+W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
+Vals[2]+=W[0];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[32];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[1]=fw1;
+W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
+W[1]+=W[10];
+W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
+Vals[3]+=W[1];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[33];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
+W[2]+=W[11];
+W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
+Vals[4]+=W[2];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[34];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
+W[3]+=W[12];
+W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
+Vals[1]+=W[3];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[35];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
+W[4]+=W[13];
+W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
+Vals[0]+=W[4];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[36];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
+W[5]+=W[14];
+W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
+Vals[6]+=W[5];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[37];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
+W[6]+=W[15];
+W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
+Vals[7]+=W[6];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[38];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
+W[7]+=W[0];
+W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
+Vals[5]+=W[7];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[39];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
+W[8]+=W[1];
+W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
+Vals[2]+=W[8];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[40];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
+W[9]+=W[2];
+W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
+Vals[3]+=W[9];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[41];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
+W[10]+=W[3];
+W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
+Vals[4]+=W[10];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[42];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[11]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
+W[11]+=W[4];
+W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
+Vals[1]+=W[11];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[43];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[12]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
+W[12]+=W[5];
+W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
+Vals[0]+=W[12];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[44];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[13]+=(rotr(W[14],7)^rotr(W[14],18)^(W[14]>>3U));
+W[13]+=W[6];
+W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
+Vals[6]+=W[13];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[45];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[14]+=(rotr(W[15],7)^rotr(W[15],18)^(W[15]>>3U));
+W[14]+=W[7];
+W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
+Vals[7]+=W[14];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[46];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[15]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
+W[15]+=W[8];
+W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
+Vals[5]+=W[15];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[47];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
+W[0]+=W[9];
+W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
+Vals[2]+=W[0];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[48];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
+W[1]+=W[10];
+W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
+Vals[3]+=W[1];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[49];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
+W[2]+=W[11];
+W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
+Vals[4]+=W[2];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[50];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
+W[3]+=W[12];
+W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
+Vals[1]+=W[3];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[51];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
+W[4]+=W[13];
+W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
+Vals[0]+=W[4];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[52];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
+W[5]+=W[14];
+W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
+Vals[6]+=W[5];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[53];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
+W[6]+=W[15];
+W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
+Vals[7]+=W[6];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[54];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
+W[7]+=W[0];
+W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
+Vals[5]+=W[7];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[55];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
+W[8]+=W[1];
+W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
+Vals[2]+=W[8];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[56];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
+W[9]+=W[2];
+W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
+Vals[3]+=W[9];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[57];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
+W[10]+=W[3];
+W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
+Vals[4]+=W[10];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[58];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[11]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
+W[11]+=W[4];
+W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
+Vals[1]+=W[11];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[59];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[12]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
+W[12]+=W[5];
+W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
+Vals[0]+=W[12];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[60];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[13]+=(rotr(W[14],7)^rotr(W[14],18)^(W[14]>>3U));
+W[13]+=W[6];
+W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
+Vals[6]+=W[13];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[61];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+Vals[7]+=W[14];
+Vals[7]+=(rotr(W[15],7)^rotr(W[15],18)^(W[15]>>3U));
+Vals[7]+=W[7];
+Vals[7]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[62];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+Vals[5]+=W[15];
+Vals[5]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
+Vals[5]+=W[8];
+Vals[5]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[63];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+Vals[5]+=state0;
+
+W[7]=state7;
+W[7]+=Vals[2];
+
+Vals[2]=0xF377ED68U;
+Vals[2]+=Vals[5];
+
+W[3]=state3;
+W[3]+=Vals[0];
+
+Vals[0]=0xa54ff53aU;
+Vals[0]+=Vals[2];
+Vals[2]+=0x08909ae5U;
+
+W[6]=state6;
+W[6]+=Vals[3];
+
+Vals[3]=0x90BB1E3CU;
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=(0x9b05688cU^(Vals[0]&0xca0b3af3U));
+
+Vals[7]+=state1;
+Vals[3]+=Vals[7];
+
+W[2]=state2;
+W[2]+=Vals[6];
+
+Vals[6]=0x3c6ef372U;
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma2(0xbb67ae85U,Vals[2],0x6a09e667U);
+
+W[5]=state5;
+W[5]+=Vals[4];
+
+Vals[4]=0x50C6645BU;
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],0x510e527fU);
+Vals[4]+=W[2];
+
+W[1]=Vals[7];
+Vals[7]=0xbb67ae85U;
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma2(0x6a09e667U,Vals[3],Vals[2]);
+
+W[4]=state4;
+W[4]+=Vals[1];
+
+Vals[1]=0x3AC42E24U;
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=W[3];
+
+W[0]=Vals[5];
+
+Vals[5]=Vals[1];
+Vals[5]+=0x6a09e667U;
+
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[4];
+Vals[0]+=W[4];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[5];
+Vals[6]+=W[5];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[6];
+Vals[7]+=W[6];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[7];
+Vals[5]+=W[7];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=0x5807AA98U;
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[9];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[10];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[11];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[12];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[13];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[14];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=0xC19BF274U;
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
+Vals[2]+=W[0];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[16];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
+W[1]+=0x00a00000U;
+Vals[3]+=W[1];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[17];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
+W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
+Vals[4]+=W[2];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[18];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
+W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
+Vals[1]+=W[3];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[19];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
+W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
+Vals[0]+=W[4];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[20];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
+W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
+Vals[6]+=W[5];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[21];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
+W[6]+=0x00000100U;
+W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
+Vals[7]+=W[6];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[22];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[7]+=0x11002000U;
+W[7]+=W[0];
+W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
+Vals[5]+=W[7];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[23];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[8]=0x80000000U;
+W[8]+=W[1];
+W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
+Vals[2]+=W[8];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[24];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[9]=W[2];
+W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
+Vals[3]+=W[9];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[25];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[10]=W[3];
+W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
+Vals[4]+=W[10];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[26];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[11]=W[4];
+W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
+Vals[1]+=W[11];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[27];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[12]=W[5];
+W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
+Vals[0]+=W[12];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[28];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[13]=W[6];
+W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
+Vals[6]+=W[13];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[29];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[14]=0x00400022U;
+W[14]+=W[7];
+W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
+Vals[7]+=W[14];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[30];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[15]=0x00000100U;
+W[15]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
+W[15]+=W[8];
+W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
+Vals[5]+=W[15];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[31];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
+W[0]+=W[9];
+W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
+Vals[2]+=W[0];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[32];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
+W[1]+=W[10];
+W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
+Vals[3]+=W[1];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[33];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
+W[2]+=W[11];
+W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
+Vals[4]+=W[2];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[34];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
+W[3]+=W[12];
+W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
+Vals[1]+=W[3];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[35];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
+W[4]+=W[13];
+W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
+Vals[0]+=W[4];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[36];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
+W[5]+=W[14];
+W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
+Vals[6]+=W[5];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[37];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
+W[6]+=W[15];
+W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
+Vals[7]+=W[6];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[38];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
+W[7]+=W[0];
+W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
+Vals[5]+=W[7];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[39];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
+W[8]+=W[1];
+W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
+Vals[2]+=W[8];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[40];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
+W[9]+=W[2];
+W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
+Vals[3]+=W[9];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[41];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
+W[10]+=W[3];
+W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
+Vals[4]+=W[10];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[42];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[11]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
+W[11]+=W[4];
+W[11]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
+Vals[1]+=W[11];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[43];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[12]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
+W[12]+=W[5];
+W[12]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
+Vals[0]+=W[12];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[44];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[13]+=(rotr(W[14],7)^rotr(W[14],18)^(W[14]>>3U));
+W[13]+=W[6];
+W[13]+=(rotr(W[11],17)^rotr(W[11],19)^(W[11]>>10U));
+Vals[6]+=W[13];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[45];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[14]+=(rotr(W[15],7)^rotr(W[15],18)^(W[15]>>3U));
+W[14]+=W[7];
+W[14]+=(rotr(W[12],17)^rotr(W[12],19)^(W[12]>>10U));
+Vals[7]+=W[14];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[46];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[15]+=(rotr(W[0],7)^rotr(W[0],18)^(W[0]>>3U));
+W[15]+=W[8];
+W[15]+=(rotr(W[13],17)^rotr(W[13],19)^(W[13]>>10U));
+Vals[5]+=W[15];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[47];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[0]+=(rotr(W[1],7)^rotr(W[1],18)^(W[1]>>3U));
+W[0]+=W[9];
+W[0]+=(rotr(W[14],17)^rotr(W[14],19)^(W[14]>>10U));
+Vals[2]+=W[0];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[48];
+Vals[0]+=Vals[2];
+Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+
+W[1]+=(rotr(W[2],7)^rotr(W[2],18)^(W[2]>>3U));
+W[1]+=W[10];
+W[1]+=(rotr(W[15],17)^rotr(W[15],19)^(W[15]>>10U));
+Vals[3]+=W[1];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[49];
+Vals[6]+=Vals[3];
+Vals[3]+=(rotr(Vals[2],2)^rotr(Vals[2],13)^rotr(Vals[2],22));
+Vals[3]+=Ma(Vals[7],Vals[2],Vals[5]);
+
+W[2]+=(rotr(W[3],7)^rotr(W[3],18)^(W[3]>>3U));
+W[2]+=W[11];
+W[2]+=(rotr(W[0],17)^rotr(W[0],19)^(W[0]>>10U));
+Vals[4]+=W[2];
+Vals[4]+=(rotr(Vals[6],6)^rotr(Vals[6],11)^rotr(Vals[6],25));
+Vals[4]+=ch(Vals[6],Vals[0],Vals[1]);
+Vals[4]+=K[50];
+Vals[7]+=Vals[4];
+Vals[4]+=(rotr(Vals[3],2)^rotr(Vals[3],13)^rotr(Vals[3],22));
+Vals[4]+=Ma(Vals[5],Vals[3],Vals[2]);
+
+W[3]+=(rotr(W[4],7)^rotr(W[4],18)^(W[4]>>3U));
+W[3]+=W[12];
+W[3]+=(rotr(W[1],17)^rotr(W[1],19)^(W[1]>>10U));
+Vals[1]+=W[3];
+Vals[1]+=(rotr(Vals[7],6)^rotr(Vals[7],11)^rotr(Vals[7],25));
+Vals[1]+=ch(Vals[7],Vals[6],Vals[0]);
+Vals[1]+=K[51];
+Vals[5]+=Vals[1];
+Vals[1]+=(rotr(Vals[4],2)^rotr(Vals[4],13)^rotr(Vals[4],22));
+Vals[1]+=Ma(Vals[2],Vals[4],Vals[3]);
+
+W[4]+=(rotr(W[5],7)^rotr(W[5],18)^(W[5]>>3U));
+W[4]+=W[13];
+W[4]+=(rotr(W[2],17)^rotr(W[2],19)^(W[2]>>10U));
+Vals[0]+=W[4];
+Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));
+Vals[0]+=ch(Vals[5],Vals[7],Vals[6]);
+Vals[0]+=K[52];
+Vals[2]+=Vals[0];
+Vals[0]+=(rotr(Vals[1],2)^rotr(Vals[1],13)^rotr(Vals[1],22));
+Vals[0]+=Ma(Vals[3],Vals[1],Vals[4]);
+
+W[5]+=(rotr(W[6],7)^rotr(W[6],18)^(W[6]>>3U));
+W[5]+=W[14];
+W[5]+=(rotr(W[3],17)^rotr(W[3],19)^(W[3]>>10U));
+Vals[6]+=W[5];
+Vals[6]+=(rotr(Vals[2],6)^rotr(Vals[2],11)^rotr(Vals[2],25));
+Vals[6]+=ch(Vals[2],Vals[5],Vals[7]);
+Vals[6]+=K[53];
+Vals[3]+=Vals[6];
+Vals[6]+=(rotr(Vals[0],2)^rotr(Vals[0],13)^rotr(Vals[0],22));
+Vals[6]+=Ma(Vals[4],Vals[0],Vals[1]);
+
+W[6]+=(rotr(W[7],7)^rotr(W[7],18)^(W[7]>>3U));
+W[6]+=W[15];
+W[6]+=(rotr(W[4],17)^rotr(W[4],19)^(W[4]>>10U));
+Vals[7]+=W[6];
+Vals[7]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[7]+=ch(Vals[3],Vals[2],Vals[5]);
+Vals[7]+=K[54];
+Vals[4]+=Vals[7];
+Vals[7]+=(rotr(Vals[6],2)^rotr(Vals[6],13)^rotr(Vals[6],22));
+Vals[7]+=Ma(Vals[1],Vals[6],Vals[0]);
+
+W[7]+=(rotr(W[8],7)^rotr(W[8],18)^(W[8]>>3U));
+W[7]+=W[0];
+W[7]+=(rotr(W[5],17)^rotr(W[5],19)^(W[5]>>10U));
+Vals[5]+=W[7];
+Vals[5]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[5]+=ch(Vals[4],Vals[3],Vals[2]);
+Vals[5]+=K[55];
+Vals[1]+=Vals[5];
+Vals[5]+=(rotr(Vals[7],2)^rotr(Vals[7],13)^rotr(Vals[7],22));
+Vals[5]+=Ma(Vals[0],Vals[7],Vals[6]);
+
+W[8]+=(rotr(W[9],7)^rotr(W[9],18)^(W[9]>>3U));
+W[8]+=W[1];
+W[8]+=(rotr(W[6],17)^rotr(W[6],19)^(W[6]>>10U));
+Vals[2]+=W[8];
+Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+Vals[2]+=K[56];
+Vals[0]+=Vals[2];
+
+W[9]+=(rotr(W[10],7)^rotr(W[10],18)^(W[10]>>3U));
+W[9]+=W[2];
+W[9]+=(rotr(W[7],17)^rotr(W[7],19)^(W[7]>>10U));
+Vals[3]+=W[9];
+Vals[3]+=(rotr(Vals[0],6)^rotr(Vals[0],11)^rotr(Vals[0],25));
+Vals[3]+=ch(Vals[0],Vals[1],Vals[4]);
+Vals[3]+=K[57];
+Vals[3]+=Vals[6];
+
+W[10]+=(rotr(W[11],7)^rotr(W[11],18)^(W[11]>>3U));
+W[10]+=W[3];
+W[10]+=(rotr(W[8],17)^rotr(W[8],19)^(W[8]>>10U));
+Vals[4]+=W[10];
+Vals[4]+=(rotr(Vals[3],6)^rotr(Vals[3],11)^rotr(Vals[3],25));
+Vals[4]+=ch(Vals[3],Vals[0],Vals[1]);
+Vals[4]+=K[58];
+Vals[4]+=Vals[7];
+Vals[1]+=(rotr(Vals[4],6)^rotr(Vals[4],11)^rotr(Vals[4],25));
+Vals[1]+=ch(Vals[4],Vals[3],Vals[0]);
+Vals[1]+=W[11];
+Vals[1]+=(rotr(W[12],7)^rotr(W[12],18)^(W[12]>>3U));
+Vals[1]+=W[4];
+Vals[1]+=(rotr(W[9],17)^rotr(W[9],19)^(W[9]>>10U));
+Vals[1]+=K[59];
+Vals[1]+=Vals[5];
+
+#define FOUND (0x80)
+#define NFLAG (0x7F)
+
+#if defined(VECTORS2) || defined(VECTORS4)
+	Vals[2]+=Ma(Vals[6],Vals[5],Vals[7]);
+	Vals[2]+=(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22));
+	Vals[2]+=W[12];
+	Vals[2]+=(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U));
+	Vals[2]+=W[5];
+	Vals[2]+=(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U));
+	Vals[2]+=Vals[0];
+	Vals[2]+=(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25));
+	Vals[2]+=ch(Vals[1],Vals[4],Vals[3]);
+
+	if (any(Vals[2] == 0x136032edU)) {
+		if (Vals[2].x == 0x136032edU)
+			output[FOUND] = output[NFLAG & nonce.x] = nonce.x;
+		if (Vals[2].y == 0x136032edU)
+			output[FOUND] = output[NFLAG & nonce.y] = nonce.y;
+#if defined(VECTORS4)
+		if (Vals[2].z == 0x136032edU)
+			output[FOUND] = output[NFLAG & nonce.z] = nonce.z;
+		if (Vals[2].w == 0x136032edU)
+			output[FOUND] = output[NFLAG & nonce.w] = nonce.w;
+#endif
+	}
+#else
+	if ((Vals[2]+
+		Ma(Vals[6],Vals[5],Vals[7])+
+		(rotr(Vals[5],2)^rotr(Vals[5],13)^rotr(Vals[5],22))+
+		W[12]+
+		(rotr(W[13],7)^rotr(W[13],18)^(W[13]>>3U))+
+		W[5]+
+		(rotr(W[10],17)^rotr(W[10],19)^(W[10]>>10U))+
+		Vals[0]+
+		(rotr(Vals[1],6)^rotr(Vals[1],11)^rotr(Vals[1],25))+
+		ch(Vals[1],Vals[4],Vals[3])) == 0x136032edU)
+			output[FOUND] = output[NFLAG & nonce] =  nonce;
+#endif
+}
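
Reviewer note on the kernel tail above: the long run of `Vals[...]` and `W[...]` updates follows the usual SHA-256 round pattern, and the final comparison stores any matching nonce in a 129-entry output buffer. The C sketch below restates those building blocks on the host side. It is an illustration under assumptions, not code from this commit: the function names (`rotr32`, `ch32`, `maj32`, `big_sigma0`, `early_exit_target`, `record_nonce`) are hypothetical, and the kernel's own `rotr`, `ch`, `Ma` and `Ma2` macros are defined outside the lines shown here, so the textbook FIPS 180-4 definitions are used instead. The one arithmetic fact relied on is that 0x136032ed equals 0 - 0x5be0cd19 - 0x90befffa modulo 2^32, which is consistent with the kernel folding the last initial hash word and K[60] into the comparison constant.

```c
#include <stdint.h>

#define FOUND 0x80u /* index of the slot that flags "a nonce was stored" */
#define NFLAG 0x7Fu /* mask selecting one of the 128 result slots */

/* Rotate a 32-bit word right by n bits (0 < n < 32). */
uint32_t rotr32(uint32_t x, unsigned n)
{
	return (x >> n) | (x << (32u - n));
}

/* SHA-256 choice: each bit of x selects the matching bit of y (1) or z (0). */
uint32_t ch32(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) ^ (~x & z);
}

/* SHA-256 majority: each result bit is the majority of the three input bits. */
uint32_t maj32(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) ^ (x & z) ^ (y & z);
}

/* Big sigma functions, applied to the working variables in every round. */
uint32_t big_sigma0(uint32_t x) { return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22); }
uint32_t big_sigma1(uint32_t x) { return rotr32(x, 6) ^ rotr32(x, 11) ^ rotr32(x, 25); }

/* Small sigma functions, used when extending the message schedule W. */
uint32_t small_sigma0(uint32_t x) { return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3); }
uint32_t small_sigma1(uint32_t x) { return rotr32(x, 17) ^ rotr32(x, 19) ^ (x >> 10); }

/* 0 - H7_init - K[60] modulo 2^32, i.e. the constant the kernel compares against. */
uint32_t early_exit_target(void)
{
	return (uint32_t)(0u - 0x5be0cd19u - 0x90befffau);
}

/* Mirror of the kernel's store: output needs at least FOUND + 1 entries. */
void record_nonce(uint32_t *output, uint32_t nonce)
{
	output[FOUND] = output[NFLAG & nonce] = nonce;
}
```

With these definitions, a line such as `Vals[0]+=(rotr(Vals[5],6)^rotr(Vals[5],11)^rotr(Vals[5],25));` reads as adding `big_sigma1(Vals[5])`, and the `W[n]+=(rotr(...,7)^rotr(...,18)^(...>>3U));` lines read as adding `small_sigma0` of the next schedule word.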

+ 13 - 5
util.c

@@ -19,7 +19,6 @@
 #include <jansson.h>
 #include <curl/curl.h>
 #include <time.h>
-#include <curses.h>
 #include <errno.h>
 #include <unistd.h>
 #include <sys/types.h>
@@ -31,6 +30,7 @@
 # include <winsock2.h>
 # include <mstcpip.h>
 #endif
+
 #include "miner.h"
 #include "elist.h"
 #include "compat.h"
@@ -365,10 +365,16 @@ json_t *json_rpc_call(CURL *curl, const char *url,
 	if (probing) {
 		pool->probed = true;
 		/* If X-Long-Polling was found, activate long polling */
-		if (hi.lp_path)
+		if (hi.lp_path) {
+			if (pool->hdr_path != NULL)
+				free(pool->hdr_path);
 			pool->hdr_path = hi.lp_path;
-		else
+		} else {
 			pool->hdr_path = NULL;
+		}
+	} else if (hi.lp_path) {
+		free(hi.lp_path);
+		hi.lp_path = NULL;
 	}
 
 	*rolltime = hi.has_rolltime;
@@ -411,9 +417,11 @@ json_t *json_rpc_call(CURL *curl, const char *url,
 		goto err_out;
 	}
 
-	if (hi.reason)
+	if (hi.reason) {
 		json_object_set_new(val, "reject-reason", json_string(hi.reason));
-
+		free(hi.reason);
+		hi.reason = NULL;
+	}
 	successful_connect = true;
 	databuf_free(&all_data);
 	curl_slist_free_all(headers);
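
Reviewer note on the util.c hunks above: both changes release heap strings (`hi.lp_path`, `hi.reason`) that were previously dropped, and free the old `pool->hdr_path` before overwriting it. The same ownership rule can be sketched as a small helper. This is a hypothetical illustration, not code from the commit: `struct pool_sketch` and `pool_set_hdr_path` are made-up names, and unlike the hunk above the helper also frees the old value when the new path is NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pool structure; only the field used here. */
struct pool_sketch {
	char *hdr_path; /* heap-allocated long-poll path, or NULL */
};

/*
 * Hand ownership of new_path (which may be NULL) to the pool,
 * releasing whatever path the pool owned before.
 */
void pool_set_hdr_path(struct pool_sketch *pool, char *new_path)
{
	if (pool->hdr_path != new_path)
		free(pool->hdr_path); /* free(NULL) is a no-op */
	pool->hdr_path = new_path;
}
```

Routing every assignment through one helper like this keeps the "free before replace" rule in a single place, at the cost of one extra function call per update.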

+ 224 - 0
windows-build.txt

@@ -0,0 +1,224 @@
+######################################################################################
+#                                                                                    #
+#          Native WIN32 setup and build instructions (on mingw32/Windows):           #
+#                                                                                    #
+######################################################################################
+
+**************************************************************************************
+* Introduction                                                                       *
+**************************************************************************************
+The following instructions have been tested on both Windows 7 and Windows XP.
+Most of what is described below (copying files, downloading files, etc.) can be done
+directly in the MinGW MSYS shell; these instructions do not do so because package
+versions and links change over time. The best way is to use your browser, go to the
+links directly, and see for yourself which versions you want to install.
+
+If you think that this documentation was helpful and you wish to donate, you can
+do so at the following address: 12KaKtrK52iQjPdtsJq7fJ7smC32tXWbWr
+
+**************************************************************************************
+* A tip that might help you along the way                                            *
+**************************************************************************************
+Enable "QuickEdit Mode" in your Command Prompt Window or MinGW Command Prompt
+Window (No need to go into the context menu to choose edit-mark/copy/paste):
+Right-click on the title bar and click Properties. Under the Options tab, check
+the box for "QuickEdit Mode". Alternatively, if you want this change to be
+permanent on all of your Command Prompt Windows, you can click Defaults instead
+of Properties as described above. Now you can drag and select text you want to
+copy, right-click to copy the text to the clipboard and right-click once again to
+paste it at the desired location. You could, for example, copy some text from this
+document to the clipboard and right click in your Command Prompt Window to paste
+what you copied.
+
+**************************************************************************************
+* Install mingw32                                                                    *
+**************************************************************************************
+Go to this url ==> http://www.mingw.org/wiki/Getting_Started
+Click the link that says "Download and run the latest mingw-get-inst version."
+Download and run the latest file. Install MinGW in the default directory.
+(I downloaded the one labeled "mingw-get-inst-20111118" - note that this could 
+be a different version later.)
+Make sure to check the option for "Download latest repository catalogs".
+I just selected all the check boxes (excluding "Fortran Compiler") so that everything
+was installed.
+
+**************************************************************************************
+* Create mstcpip.h                                                                   *
+**************************************************************************************
+Open notepad and copy the following into it. Save it as "\MinGW\include\mstcpip.h".
+Make sure it does not have the ".txt" extension (If it does then rename it).
+
+struct tcp_keepalive
+{
+    u_long onoff;
+    u_long keepalivetime;
+    u_long keepaliveinterval;
+};
+
+#ifndef USE_WS_PREFIX
+
+#define SIO_KEEPALIVE_VALS    _WSAIOW(IOC_VENDOR, 4)
+
+#else
+
+#define WS_SIO_KEEPALIVE_VALS    _WSAIOW(WS_IOC_VENDOR, 4)
+
+#endif
+
+**************************************************************************************
+* Run the MSYS shell for the first time to create your user directory                *
+************************************************************************************** 
+(Start Icon/keyboard key ==> All Programs ==> MinGW ==> MinGW Shell).
+This will create your user directory for you.
+
+**************************************************************************************
+* Install libpdcurses                                                                *
+**************************************************************************************
+Type the lines below to install libpdcurses.
+mingw-get install mingw32-libpdcurses
+mingw-get install mingw32-pdcurses
+Ctrl-D or typing "logout" and pressing the enter key should get you out of the
+window.
+
+**************************************************************************************
+* Copy CGMiner source to your MSYS working directory                                 *
+**************************************************************************************
+Copy CGMiner source code directory into: 
+\MinGW\msys\1.0\home\(folder with your user name)
+
+**************************************************************************************
+* Install AMD APP SDK, latest version (only if you want GPU mining)                  *
+**************************************************************************************
+Note: You do not need to install the AMD APP SDK if you are only using Nvidia GPUs.
+Go to this url for the latest AMD APP SDK: 
+ http://developer.amd.com/sdks/AMDAPPSDK/downloads/Pages/default.aspx
+Go to this url for legacy AMD APP SDKs:
+ http://developer.amd.com/sdks/AMDAPPSDK/downloads/pages/AMDAPPSDKDownloadArchive.aspx
+Download and install whichever version you like best.
+Copy the folders in \Program Files (x86)\AMD APP\include to \MinGW\include 
+Copy \Program Files (x86)\AMD APP\lib\x86\libOpenCL.a to \MinGW\lib
+Note: If you are on a 32-bit version of Windows, "Program Files (x86)" will be
+"Program Files".
+Note2: If you update your APP SDK later, you might want to recopy the above files.
+
+**************************************************************************************
+* Install AMD ADL SDK, latest version (only if you want GPU monitoring)              *
+**************************************************************************************
+Note: You do not need to install the AMD ADL SDK if you are only using Nvidia GPUs.
+Go to this url ==> http://developer.amd.com/sdks/ADLSDK/Pages/default.aspx
+Download and unzip the file you downloaded.
+Pull adl_defines.h, adl_sdk.h, and adl_structures.h out of the include folder and
+put those files into the ADL_SDK folder in your source tree as shown below.
+\MinGW\msys\1.0\home\(folder with your user name)\cgminer-x.x.x\ADL_SDK
+
+**************************************************************************************
+* Install GTK-WIN, required for Pkg-config in the next step                          *
+**************************************************************************************
+Go to this url ==> http://sourceforge.net/projects/gtk-win/ 
+Download the file.
+After you have downloaded the file, double-click it to run it; this will install GTK+.
+I chose all the selection boxes when I installed.
+Copy libglib-2.0-0.dll and intl.dll from \Program Files (x86)\gtk2-runtime\bin to 
+\MinGW\bin
+Note: If you are on a 32-bit version of Windows, "Program Files (x86)" will be
+"Program Files".
+
+**************************************************************************************
+* Install pkg-config                                                                 *
+**************************************************************************************
+Go to this url ==> http://www.gtk.org/download/win32.php
+Scroll down to where it shows pkg-cfg.
+Download the file from the tool link. Extract "pkg-config.exe" from bin and place in
+your  \MinGW\bin directory.
+Download the file from the "Dev" link. Extract "pkg.m4" from share\aclocal and place
+in your \MinGW\share\aclocal directory.
+		
+**************************************************************************************
+* Install libcurl                                                                    *
+**************************************************************************************
+Go to this url ==> http://curl.haxx.se/download.html#Win32
+At the section where it says "Win32 - Generic", Click on the link that indicates
+Win32 2000.XP 7.24.0 libcurl SSL and download it.
+The one I downloaded may not be current for you. Choose the latest.
+Extract the files that are in the zip (bin, include, and lib) to their respective
+locations in MinGW (\MinGW\bin, \MinGW\include, and \MinGW\lib).
+Edit the file \MinGW\lib\pkgconfig\libcurl.pc and change "-lcurl" to 
+"-lcurl -lcurldll".
+Ref. http://old.nabble.com/gcc-working-with-libcurl-td20506927.html
+
+**************************************************************************************
+* Build cgminer.exe                                                                  *
+**************************************************************************************
+Run the MinGW MSYS shell 
+(Start Icon/keyboard key ==> All Programs ==> MinGW ==> MinGW Shell).	
+Change the working directory to your CGMiner project folder.
+Example: cd cgminer-2.1.2 [Enter Key]. If you are unsure of the name, type "ls -la".
+Another way is to type "cd cg" and then press the tab key; it will auto-fill.
+Type the lines below one at a time. Look for problems after each one before going on
+to the next.
+
+      adl.sh (optional - see below)
+      autoreconf -fvi
+      CFLAGS="-O2 -msse2" ./configure (additional config options, see below)
+      make
+
+**************************************************************************************
+* Copy files to a build directory/folder                                             *
+**************************************************************************************
+Make a directory and copy the following files into it. This will be your CGMiner
+Folder that you use for mining. Remember the .cl filenames could change on later
+releases. If you installed a different version of libcurl then some of those dll's
+may be different as well.
+  cgminer.exe     from \MinGW\msys\1.0\home\(username)\cgminer-x.x.x 
+  *.cl            from \MinGW\msys\1.0\home\(username)\cgminer-x.x.x
+  README          from \MinGW\msys\1.0\home\(username)\cgminer-x.x.x
+  libcurl.dll     from \MinGW\bin
+  libeay32.dll    from \MinGW\bin
+  libidn-11.dll   from \MinGW\bin
+  libssl32.dll    from \MinGW\bin
+  libpdcurses.dll from \MinGW\bin
+  pthreadGC2.dll  from \MinGW\bin
+  
+**************************************************************************************
+* Optional - Install Git into MinGW/MSYS                                             *
+**************************************************************************************
+Go to this url ==> http://code.google.com/p/msysgit/
+Click on the Downloads tab.
+Download the latest "Portable" git archive.
+Extract the git*.exe files from the bin folder and put them into \MinGW\bin.
+Extract the share\git-core folder and place it into \MinGW\share.
+To test if it is working, open a MinGW shell and type the following:
+  git config --global core.autocrlf false (note: one time run only)
+  git clone git://github.com/ckolivas/cgminer.git
+  
+If you just want to update the source after you have already cloned, type:
+  git pull git://github.com/ckolivas/cgminer.git
+
+Now you can get the latest source directly from github.
+
+**************************************************************************************
+* Optional - Make a .sh file to automate copying over ADL files                      *
+**************************************************************************************
+Make a folder/directory in your home folder and name it ADL_SDK.
+ (ref:  \MinGW\msys\1.0\home\(folder with your user name)\ADL_SDK)
+Copy the ADL .h files into that folder/directory.
+Open your favorite text editor and type the following into it.
+ cp -av ../ADL_SDK/*.h ADL_SDK
+Save the file as "adl.sh" and then place the file into "\MinGW\msys\1.0\bin".
+From now on, when your current working directory is the cgminer source directory,
+you can simply type "adl.sh" and it will copy the ADL header files into place
+for you. Make sure you never remove the ADL_SDK folder from your home folder.
+
+**************************************************************************************
+* Some ./configure options                                                           *
+**************************************************************************************
+--disable-opencl        Override detection and disable building with opencl
+--disable-adl           Override detection and disable building with adl
+--enable-bitforce       Compile support for BitForce FPGAs (default disabled)
+--enable-icarus         Compile support for Icarus Board (default disabled)
+
+######################################################################################
+#                                                                                    #
+#       Native WIN32 setup and build instructions (on mingw32/Windows) complete      #
+#                                                                                    #
+######################################################################################

Some files were not shown because too many files changed in this diff